bstring 1.1.0
Interface for basic Unicode utility functions for bstrings.
Functions
BSTR_PUBLIC int buIsUTF8Content(const bstring bu)
BSTR_PUBLIC int buAppendBlkUcs4(bstring b, const cpUcs4 *bu, int len, cpUcs4 errCh)
BSTR_PUBLIC int buGetBlkUTF16(cpUcs2 *ucs2, int len, cpUcs4 errCh, const bstring bu, int pos)
BSTR_PUBLIC int buAppendBlkUTF16(bstring bu, const cpUcs2 *utf16, int len, cpUcs2 *bom, cpUcs4 errCh)
Depends on bstrlib.h and utf8util.h.
BSTR_PUBLIC int buAppendBlkUcs4(bstring b, const cpUcs4 *bu, int len, cpUcs4 errCh)
Convert an array of UCS-4 code points (bu, len elements) to UTF-8 and append the result to the bstring b.
Any invalid code point is replaced by errCh. If errCh is itself not a valid code point, translation halts on the first error and BSTR_ERR is returned. Otherwise BSTR_OK is returned.
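The replace-or-fail behaviour described above can be illustrated with a standalone sketch. It is not the library's implementation: it writes into a plain byte buffer rather than a bstring, all `sketch_` names are hypothetical, and it assumes a code point is valid when it is at most U+10FFFF and not a surrogate (the library's own validity test may be stricter).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed validity rule: at most U+10FFFF and not a surrogate. */
static int sketch_is_legal_cp(uint32_t cp) {
    return cp <= 0x10FFFFu && !(cp >= 0xD800u && cp <= 0xDFFFu);
}

/* Encode one legal code point as UTF-8; returns the byte count (1..4). */
static size_t sketch_encode_utf8(uint32_t cp, unsigned char out[4]) {
    if (cp < 0x80u) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800u) {
        out[0] = (unsigned char)(0xC0u | (cp >> 6));
        out[1] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 2;
    }
    if (cp < 0x10000u) {
        out[0] = (unsigned char)(0xE0u | (cp >> 12));
        out[1] = (unsigned char)(0x80u | ((cp >> 6) & 0x3Fu));
        out[2] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 3;
    }
    out[0] = (unsigned char)(0xF0u | (cp >> 18));
    out[1] = (unsigned char)(0x80u | ((cp >> 12) & 0x3Fu));
    out[2] = (unsigned char)(0x80u | ((cp >> 6) & 0x3Fu));
    out[3] = (unsigned char)(0x80u | (cp & 0x3Fu));
    return 4;
}

/* Append src[0..len) as UTF-8 to dst, starting at offset *used.
   Invalid code points become err_ch; if err_ch is itself invalid,
   stop at the first bad input. Returns 0 on success, -1 on error
   (including running out of room in dst). */
int sketch_append_ucs4(unsigned char *dst, size_t cap, size_t *used,
                       const uint32_t *src, size_t len, uint32_t err_ch) {
    for (size_t i = 0; i < len; i++) {
        uint32_t cp = src[i];
        unsigned char tmp[4];
        size_t n;
        if (!sketch_is_legal_cp(cp)) {
            if (!sketch_is_legal_cp(err_ch)) return -1;
            cp = err_ch;
        }
        n = sketch_encode_utf8(cp, tmp);
        if (*used + n > cap) return -1;
        for (size_t k = 0; k < n; k++) dst[(*used)++] = tmp[k];
    }
    return 0;
}
```

With this sketch, passing `'?'` as `err_ch` turns a lone surrogate such as 0xD800 into a single `?` byte, while passing an invalid `err_ch` makes the same input fail.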
BSTR_PUBLIC int buAppendBlkUTF16(bstring bu, const cpUcs2 *utf16, int len, cpUcs2 *bom, cpUcs4 errCh)
Append an array of UTF-16 code units (utf16, len elements) to the UTF-8 bstring bu.
Any invalid code point is replaced by errCh. If errCh is itself not a valid code point, translation halts on the first error and BSTR_ERR is returned. Otherwise BSTR_OK is returned. If a byte order mark (BOM) has already been read, pass it in via bom. If *bom is 0, it is filled in from the first code unit when that unit is a BOM.
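The BOM and surrogate handling described above can be sketched as a standalone decoder from UTF-16 code units to code points. This is an illustration under stated assumptions, not the library's code: the `sketch_` names are hypothetical, the output is a code point array rather than UTF-8 in a bstring, a stored BOM of 0xFFFE is taken to mean byte-swapped input, and input without a BOM is assumed to be in native order.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit code unit. */
static uint16_t sketch_swap16(uint16_t u) {
    return (uint16_t)((u << 8) | (u >> 8));
}

/* Decode in[0..len) into code points.
   *bom == 0 means "byte order not yet known"; on return it holds
   0xFEFF (native order) or 0xFFFE (byte-swapped input).
   Unpaired surrogates are replaced by err_ch.
   Returns the number of code points written, or -1 if out is too small. */
long sketch_utf16_to_cps(const uint16_t *in, size_t len, uint16_t *bom,
                         uint32_t err_ch, uint32_t *out, size_t cap) {
    size_t i = 0, n = 0;
    if (len == 0) return 0;
    if (*bom == 0) {
        if (in[0] == 0xFEFFu || in[0] == 0xFFFEu) {
            *bom = in[0];   /* remember the BOM and drop it from the output */
            i = 1;
        } else {
            *bom = 0xFEFFu; /* assumption: no BOM means native order */
        }
    }
    while (i < len) {
        uint16_t u = in[i++];
        uint32_t cp;
        if (*bom == 0xFFFEu) u = sketch_swap16(u);
        cp = u;
        if (u >= 0xD800u && u <= 0xDBFFu) {
            /* High surrogate: must be followed by a low surrogate. */
            uint16_t lo = 0;
            if (i < len) {
                lo = in[i];
                if (*bom == 0xFFFEu) lo = sketch_swap16(lo);
            }
            if (i < len && lo >= 0xDC00u && lo <= 0xDFFFu) {
                cp = 0x10000u + ((uint32_t)(u - 0xD800u) << 10)
                              + (uint32_t)(lo - 0xDC00u);
                i++;
            } else {
                cp = err_ch;
            }
        } else if (u >= 0xDC00u && u <= 0xDFFFu) {
            cp = err_ch;    /* low surrogate with no preceding high one */
        }
        if (n >= cap) return -1;
        out[n++] = cp;
    }
    return (long)n;
}
```

Keeping the detected byte order in `*bom` lets a caller feed a stream in several blocks: the first call fills it in, and later calls reuse it.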
BSTR_PUBLIC int buGetBlkUTF16(cpUcs2 *ucs2, int len, cpUcs4 errCh, const bstring bu, int pos)
Convert the UTF-8 bstring bu (starting at code-point offset pos) to a sequence of UTF-16 encoded code units written to ucs2 (at most len units).
Returns the number of 16-bit code units written to ucs2. Any code point in bu that cannot be parsed is translated to errCh.
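The offset, output limit and replacement rules above can be sketched as follows. To stay short, this standalone example takes already-decoded code points as input instead of a UTF-8 bstring, so it covers only the UTF-16 encoding half of the conversion. The `sketch_` name is hypothetical, and two choices are assumptions of the sketch rather than documented library behaviour: an invalid `err_ch` falls back to U+FFFD, and a surrogate pair that does not fit in the remaining space is not written at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode cps[pos..ncps) as UTF-16 into out, writing at most cap units.
   Code points that UTF-16 cannot represent (surrogates, or values above
   U+10FFFF) are replaced by err_ch.
   Returns the number of 16-bit code units written. */
size_t sketch_cps_to_utf16(const uint32_t *cps, size_t ncps, size_t pos,
                           uint32_t err_ch, uint16_t *out, size_t cap) {
    size_t n = 0;
    /* Sketch's choice: fall back to U+FFFD if err_ch is itself unusable. */
    if (err_ch > 0x10FFFFu || (err_ch >= 0xD800u && err_ch <= 0xDFFFu)) {
        err_ch = 0xFFFDu;
    }
    for (size_t i = pos; i < ncps; i++) {
        uint32_t cp = cps[i];
        if (cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu)) cp = err_ch;
        if (cp < 0x10000u) {
            /* Basic Multilingual Plane: one code unit. */
            if (n + 1 > cap) break;
            out[n++] = (uint16_t)cp;
        } else {
            /* Supplementary plane: split into a surrogate pair. */
            if (n + 2 > cap) break;   /* never emit half a pair */
            cp -= 0x10000u;
            out[n++] = (uint16_t)(0xD800u + (cp >> 10));
            out[n++] = (uint16_t)(0xDC00u + (cp & 0x3FFu));
        }
    }
    return n;
}
```

Because one code point can need two code units, the return value counts units, not code points, and can be smaller than `cap` even when input remains.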
BSTR_PUBLIC int buIsUTF8Content(const bstring bu)
Scan a bstring and return 1 if its entire content consists of valid UTF-8 encoded code points, otherwise return 0.
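As an illustration of what such a scan involves, here is a minimal standalone validator over a raw byte buffer. It is a sketch, not the library's implementation: `sketch_is_utf8` is a hypothetical name, and the rules it applies (reject truncated sequences, stray continuation bytes, overlong forms, surrogates, and values above U+10FFFF) follow the usual definition of well-formed UTF-8; the library may reject additional code points.

```c
#include <stddef.h>

/* Return 1 if buf[0..len) is well-formed UTF-8, otherwise 0. */
int sketch_is_utf8(const unsigned char *buf, size_t len) {
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t extra;          /* continuation bytes expected */
        unsigned long cp, min; /* decoded value and smallest legal value */

        if (b < 0x80u) {       /* ASCII: single byte */
            i++;
            continue;
        } else if ((b & 0xE0u) == 0xC0u) {
            extra = 1; cp = b & 0x1Fu; min = 0x80ul;
        } else if ((b & 0xF0u) == 0xE0u) {
            extra = 2; cp = b & 0x0Fu; min = 0x800ul;
        } else if ((b & 0xF8u) == 0xF0u) {
            extra = 3; cp = b & 0x07u; min = 0x10000ul;
        } else {
            return 0;          /* stray continuation or invalid lead byte */
        }

        if (len - i <= extra) return 0;   /* sequence is truncated */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0u) != 0x80u) return 0;
            cp = (cp << 6) | (c & 0x3Fu);
        }

        if (cp < min) return 0;                          /* overlong form */
        if (cp > 0x10FFFFul) return 0;                   /* out of range */
        if (cp >= 0xD800ul && cp <= 0xDFFFul) return 0;  /* surrogate */

        i += extra + 1;
    }
    return 1;
}
```

An empty buffer is treated as valid here, since it contains no ill-formed sequence.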