The parser library provides its own implementation of UTF-8 support, so it does not depend on other Unicode libraries.
The library defines utf8char and utf8string based on unsigned char:
typedef unsigned char utf8char;
typedef std::basic_string<utf8char> utf8string;
Checks whether a character is the beginning of a multibyte codepoint or a continuation part of a multibyte codepoint.
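The check described above comes down to inspecting the top bits of a single code unit. The following is a minimal sketch of that idea. The function names are hypothetical, since this section does not show the library's actual names; only the utf8char typedef is taken from the text.

```cpp
typedef unsigned char utf8char;  // same typedef the library documents

// Hypothetical helper names, chosen for illustration only.

// Continuation bytes have the bit pattern 10xxxxxx.
inline bool is_continuation_byte(utf8char c) {
    return (c & 0xC0) == 0x80;
}

// Valid lead bytes of 2-, 3- and 4-byte sequences lie in 0xC2..0xF4
// (0xC0, 0xC1 and 0xF5..0xFF never occur in well-formed UTF-8).
inline bool is_lead_byte(utf8char c) {
    return c >= 0xC2 && c <= 0xF4;
}

// Any byte with the high bit set belongs to a multibyte sequence,
// either as its first byte or as a continuation.
inline bool is_multibyte_unit(utf8char c) {
    return (c & 0x80) != 0;
}
```

For the Euro sign (bytes E2 82 AC), the first byte is a lead byte and the other two are continuation bytes; a plain ASCII byte such as 'A' is neither.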
Converts a const utf8char* sequence s of 1-4 UTF-8 code units into a codepoint ch.
s is a string of unsigned chars containing UTF-8 text
l is the size of the s string
ch is a reference to a codepoint, which is populated with the result
Returns the byte size of the codepoint (0-4 bytes), or (size_t) -1 if an illegal UTF-8 code unit is found.
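A decoder with the parameters s, l and ch described above could look roughly like the sketch below. The name decode_utf8 is hypothetical. Returning 0 for an empty input and rejecting overlong forms, surrogates and values above U+10FFFF are assumptions about what counts as illegal, not confirmed library behavior.

```cpp
#include <cstddef>

typedef unsigned char utf8char;

// Hypothetical name; decodes one codepoint from the first bytes of s.
// Returns the number of bytes consumed, 0 if l is 0 (assumed),
// or (size_t)-1 if the input is not well-formed UTF-8.
std::size_t decode_utf8(const utf8char* s, std::size_t l, char32_t& ch) {
    const std::size_t illegal = static_cast<std::size_t>(-1);
    if (l == 0) return 0;

    const utf8char b0 = s[0];
    std::size_t len;
    char32_t cp;
    if (b0 < 0x80)                { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else return illegal;          // stray continuation byte or invalid lead byte

    if (len > l) return illegal;  // sequence is truncated

    // Each continuation byte contributes its low 6 bits.
    for (std::size_t i = 1; i < len; ++i) {
        if ((s[i] & 0xC0) != 0x80) return illegal;
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    // Smallest codepoint that legitimately needs len bytes.
    static const char32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
    if (cp < min_for_len[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return illegal;           // overlong form, out of range, or surrogate

    ch = cp;
    return len;
}
```

With the three bytes E2 82 AC and l = 3, this sketch stores U+20AC in ch and returns 3; with l = 2 the sequence is truncated and the result is (size_t)-1.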
Returns the size in bytes (0-4) of the UTF-8 encoding of a Unicode codepoint ch.
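The size of an encoded codepoint follows directly from its numeric range. Here is a minimal sketch with a hypothetical function name; treating 0 as "not encodable" is an assumption about what the 0 in the documented 0-4 range means.

```cpp
#include <cstddef>

// Hypothetical name. Returns how many bytes the UTF-8 encoding of ch needs.
// Surrogate codepoints are not filtered out in this sketch.
std::size_t utf8_size(char32_t ch) {
    if (ch < 0x80)      return 1;  // 7 bits:  0xxxxxxx
    if (ch < 0x800)     return 2;  // 11 bits: 110xxxxx 10xxxxxx
    if (ch < 0x10000)   return 3;  // 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
    if (ch <= 0x10FFFF) return 4;  // 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return 0;                      // beyond Unicode; assumed meaning of 0
}
```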
Converts a Unicode codepoint ch to a UTF-8 sequence of unsigned chars and writes it to the address s. Make sure at least 4 bytes are allocated at s.
Returns the byte size of the codepoint ch.
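Encoding into a caller-supplied buffer, as described above, can be sketched as follows. The name encode_utf8 is hypothetical, and returning 0 without writing anything for values above U+10FFFF is an assumption.

```cpp
#include <cstddef>

typedef unsigned char utf8char;

// Hypothetical name. Writes the UTF-8 encoding of ch to s, which must have
// room for at least 4 bytes, and returns the number of bytes written.
std::size_t encode_utf8(char32_t ch, utf8char* s) {
    if (ch < 0x80) {
        s[0] = static_cast<utf8char>(ch);
        return 1;
    }
    if (ch < 0x800) {
        s[0] = static_cast<utf8char>(0xC0 | (ch >> 6));
        s[1] = static_cast<utf8char>(0x80 | (ch & 0x3F));
        return 2;
    }
    if (ch < 0x10000) {
        s[0] = static_cast<utf8char>(0xE0 | (ch >> 12));
        s[1] = static_cast<utf8char>(0x80 | ((ch >> 6) & 0x3F));
        s[2] = static_cast<utf8char>(0x80 | (ch & 0x3F));
        return 3;
    }
    if (ch <= 0x10FFFF) {
        s[0] = static_cast<utf8char>(0xF0 | (ch >> 18));
        s[1] = static_cast<utf8char>(0x80 | ((ch >> 12) & 0x3F));
        s[2] = static_cast<utf8char>(0x80 | ((ch >> 6) & 0x3F));
        s[3] = static_cast<utf8char>(0x80 | (ch & 0x3F));
        return 4;
    }
    return 0;  // not a Unicode codepoint; nothing is written (assumption)
}
```

A fixed `utf8char buf[4];` on the stack is enough for any single codepoint, which is why the text asks for at least 4 bytes.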
Converts the Unicode codepoint ch to 1-4 utf8chars and writes them to the stream os.
Returns the stream os.
Converts the Unicode codepoint ch to 1-4 utf8chars and pushes them into the referenced vector v.
Returns a reference to the vector v.
Stream << operator that converts a C string s to u32 before writing it to the u32 stream os.
Stream << operator that converts a std::string s to u32 before writing it to the u32 stream os.
utf8string to_utf8string(int32_t v);
utf8string to_utf8string(char ch);
utf8string to_utf8string(const char* s);
utf8string to_utf8string(const std::string& s);
utf8string to_utf8string(char32_t ch);
utf8string to_utf8string(const std::u32string& str);
std::u32string to_u32string(const std::u32string& str);
std::u32string to_u32string(const utf8string& str);
std::u32string to_u32string(const std::string& str);
std::string to_string(const std::string& s);
std::string to_string(const utf8string& s);
std::string to_string(const std::u32string& s);
std::string to_string(const char32_t& s);
std::string to_std_string(const std::string& s);
std::string to_std_string(const utf8string& s);
std::string to_std_string(const std::u32string& s);
std::string to_std_string(const char32_t& ch);
Templated conversions of chars and strings based on char or char32_t to strings based on char or char32_t. These are useful for specifying the result type at compile time.
template <typename CharT>
typename std::basic_string<CharT> from_cstr(const char *);
template <typename CharT>
typename std::basic_string<CharT> from_cstr(const char32_t *);
template <typename CharT>
typename std::basic_string<CharT> from_str(const std::string&);
template <typename CharT>
typename std::basic_string<CharT> from_str(const std::u32string&);
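To illustrate how choosing the result type at compile time might work, here is a sketch of the const char* overload of from_cstr written with explicit specializations. It shows one possible approach and is not the library's implementation. The decoder inside it assumes the input is well-formed UTF-8 and does no validation.

```cpp
#include <string>

// Primary template: declaration only. Each supported result type
// gets its own specialization below.
template <typename CharT>
std::basic_string<CharT> from_cstr(const char* s);

// char result: the bytes are copied unchanged.
template <>
inline std::basic_string<char> from_cstr<char>(const char* s) {
    return std::string(s);
}

// char32_t result: the input is decoded as UTF-8.
// Assumption: s is well-formed UTF-8; malformed input is not detected.
template <>
inline std::basic_string<char32_t> from_cstr<char32_t>(const char* s) {
    std::u32string out;
    while (*s) {
        const unsigned char b = static_cast<unsigned char>(*s++);
        char32_t cp;
        int extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        while (extra-- > 0 && *s)
            cp = (cp << 6) | (static_cast<unsigned char>(*s++) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Generic code can now pick the string type through a template parameter.
template <typename CharT>
std::basic_string<CharT> sample_text() {
    return from_cstr<CharT>("caf\xC3\xA9");  // "café" as UTF-8 bytes
}
```

Here `sample_text<char>()` yields the 5 original bytes, while `sample_text<char32_t>()` yields 4 codepoints ending in U+00E9.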