Converting Unicode Escape Sequences to UTF-8
anonymous via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Thu Oct 22 11:40:05 PDT 2015
On Thursday, October 22, 2015 08:10 PM, Nordlöw wrote:
> How do I convert a `string` containing Unicode escape sequences
> such as "\uXXXX" into UTF-8?
Ali explained that a "\uXXXX" escape in a D string literal is already UTF-8:
the compiler encodes it when it compiles the literal.
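A small sketch of what that means in practice, using U+00E9 and U+1F600 as
arbitrary sample code points. The checks run at compile time, so no main
function is needed:

```d
// In a double-quoted literal the compiler has already turned the
// escape into UTF-8 code units by the time the program runs.
static assert("\u00e9".length == 2);        // U+00E9 takes two UTF-8 code units
static assert("\u00e9" == "\xC3\xA9");      // the same two bytes, written out
static assert("\U0001F600".length == 4);    // U+1F600 takes four

// A WYSIWYG (backtick) literal keeps the backslash, which is what
// text read from a file or from user input looks like.
static assert(`\u00e9`.length == 6);
```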
But if you actually want to interpret such escape sequences from user input
or some such, then find all occurrences, and for each of them do:
* Drop the backslash and the 'u'.
* Parse XXXX as a hexadecimal integer, and cast to dchar.
* Use std.utf.encode to convert to UTF-8. std.conv.to can probably do it
too, and possibly simpler, but would allocate.
Also be aware of the longer variant with a capital U: \UXXXXXXXX (eight hex
digits).
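The steps above could be sketched roughly like this. The function name
unescapeUnicode is made up for illustration and is not part of Phobos. The
sketch also assumes that malformed escapes (too few digits, non-hex digits,
or an invalid code point such as a lone surrogate) should be copied through
unchanged, and it appends to a plain string for simplicity, so it does
allocate:

```d
import std.ascii : isHexDigit;
import std.conv : to;
import std.utf : encode, isValidDchar;

// Hypothetical helper: replaces \uXXXX and \UXXXXXXXX with UTF-8.
string unescapeUnicode(string input)
{
    string result;
    size_t i = 0;
    while (i < input.length)
    {
        // Hex digits expected: 4 after \u, 8 after \U, 0 if no escape here.
        size_t digits = 0;
        if (input[i] == '\\' && i + 1 < input.length)
        {
            if (input[i + 1] == 'u') digits = 4;
            else if (input[i + 1] == 'U') digits = 8;
        }

        if (digits != 0 && i + 2 + digits <= input.length)
        {
            // Drop the backslash and the 'u' or 'U'.
            auto hex = input[i + 2 .. i + 2 + digits];

            bool allHex = true;
            foreach (char c; hex)
            {
                if (!isHexDigit(c)) { allHex = false; break; }
            }

            if (allHex)
            {
                // Parse as a hexadecimal integer and cast to dchar.
                const code = cast(dchar) hex.to!uint(16);
                if (isValidDchar(code))
                {
                    // Encode to UTF-8 into a stack buffer.
                    char[4] buf;
                    const len = encode(buf, code);
                    result ~= buf[0 .. len];
                    i += 2 + digits;
                    continue;
                }
            }
        }

        // Not a well-formed escape: copy the code unit unchanged.
        result ~= input[i];
        ++i;
    }
    return result;
}
```

For example, unescapeUnicode(`caf\u00e9`) should give "café", while
unescapeUnicode(`\uZZZZ`) comes back as it went in.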