what's the correct way to handle unicode? - trying to print out graphemes here.

Steven Schveighoffer schveiguy at yahoo.com
Tue Jul 3 14:31:08 UTC 2018


On 7/3/18 9:32 AM, aliak wrote:
> Hi, trying to figure out how to loop through a string of characters and 
> then spit them back out.
> 
> Eg:
> 
> foreach (c; "👩‍👩‍👦‍👦🏳️‍🌈") {
>    writeln(c);
> }
> 
> So basically the above just doesn't work. Prints gibberish.
> 
> So I figured, std.uni.byGrapheme would help, since that's what they are, 
> but I can't get it to print them back out? Is there a way?
> 
> foreach (c; "👩‍👩‍👦‍👦🏳️‍🌈".byGrapheme) {
>    writeln(c.<????>);
> }
> 
> And then if I type the loop variable as dchar, then it seems that the 
> family emoji is printed out as 4 faces - so the code points I guess - 
> and the rainbow flag is other stuff (also its code points I assume)

Yeah, it appears that you can't actually print a grapheme directly. I 
would have assumed writeln(c) works. It does compile and run, it just 
prints the struct's internal data instead of converting back to UTF.
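
One possible workaround, as a minimal sketch rather than a confirmed 
answer: slicing a Grapheme with [] yields a range over its code points 
(dchar), and those can be collected and re-encoded as a UTF-8 string. 
The helper name graphemeToString below is hypothetical, not part of 
Phobos. Whether the family and flag emoji each come out as a single 
cluster may depend on the Unicode rules the installed std.uni implements.

```d
import std.array : array;
import std.conv : to;
import std.stdio : writeln;
import std.uni : Grapheme, byGrapheme;

/// Hypothetical helper (not part of Phobos): re-encode one grapheme
/// cluster as a UTF-8 string.
string graphemeToString(Grapheme g)
{
    // g[] is a range over the cluster's code points (dchar);
    // collect them into a dchar[] and transcode that to UTF-8.
    return g[].array.to!string;
}

void main()
{
    foreach (g; "👩‍👩‍👦‍👦🏳️‍🌈".byGrapheme)
    {
        // Prints the cluster's text rather than the struct's fields.
        writeln(graphemeToString(g));
    }
}
```

The resulting strings can also be stored (for example in a string[]) and 
written out later, which covers the "store and output" part of the 
question without reaching for ICU.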

> Is there a type that I can use to store graphemes and then output them 
> as a grapheme as well? Or do I have to use like lib ICU maybe or 
> something similar?

I honestly can't figure it out. I think directly writing graphemes as 
viewable UTF was not something that was considered.

Definitely needs a bugzilla issue.

-Steve

