Converting Unicode Escape Sequences to UTF-8

Ali Çehreli via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Oct 22 11:17:44 PDT 2015


On 10/22/2015 11:10 AM, Nordlöw wrote:
> How do I convert a `string` containing Unicode escape sequences such as
> "\uXXXX" into UTF-8?

It's already UTF-8 because it's a 'string': the compiler stores a "\uXXXX"
escape in a string literal as UTF-8 code units. :)

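import std.stdio;

void main() {
    // The compiler encodes the \u1234 escape as UTF-8 in the literal.
    auto s = "\u1234";

    // Iterating a string without an explicit element type visits
    // individual UTF-8 code units (char), not whole code points.
    foreach (codeUnit; s) {
        // Print each code unit in hexadecimal and in binary.
        writefln("%02x %08b", codeUnit, codeUnit);
    }
}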

The output shows three UTF-8 code units for "U+1234 ETHIOPIC SYLLABLE SEE",
not the two bytes that the four hex digits might suggest:

e1 11100001
88 10001000
b4 10110100
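
To see where those three code units come from, here is a minimal sketch of
the three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx) done by hand. The
function name encode3 is made up for this illustration; it is not a Phobos
function, and it only handles code points from U+0800 to U+FFFF without
rejecting surrogates:

```d
import std.stdio;

// Hypothetical helper for illustration only: encodes a code point in the
// range U+0800..U+FFFF as three UTF-8 code units. It does not reject the
// surrogate range U+D800..U+DFFF.
ubyte[3] encode3(dchar c) {
    assert(c >= 0x800 && c <= 0xFFFF, "only the three-byte range is handled");
    return [
        cast(ubyte)(0xE0 | (c >> 12)),          // 1110xxxx: top 4 bits
        cast(ubyte)(0x80 | ((c >> 6) & 0x3F)),  // 10xxxxxx: middle 6 bits
        cast(ubyte)(0x80 | (c & 0x3F)),         // 10xxxxxx: low 6 bits
    ];
}

void main() {
    string s = "\u1234";
    dchar c = '\u1234';
    const units = encode3(c);

    // The literal already holds the same three code units.
    assert(s.length == units.length);
    foreach (i, codeUnit; units) {
        assert(codeUnit == s[i]);
        writefln("%02x %08b", codeUnit, codeUnit);
    }
}
```

This should print the same three lines as above. In real code you would
not write this yourself; the point is only that the bytes in the 'string'
are the UTF-8 encoding of the code point named by the escape.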

Ali


