what's the correct way to handle unicode? - trying to print out graphemes here.

Steven Schveighoffer schveiguy at yahoo.com
Tue Jul 3 14:43:37 UTC 2018


On 7/3/18 10:37 AM, ag0aep6g wrote:
> On Tuesday, 3 July 2018 at 13:32:52 UTC, aliak wrote:
>> foreach (c; "👩‍👩‍👦‍👦🏳️‍🌈") {
>>   writeln(c);
>> }
>>
>> So basically the above just doesn't work. Prints gibberish.
> 
> Because you're printing one UTF-8 code unit (`char`) per line.
> 
>> So I figured, std.uni.byGrapheme would help, since that's what they 
>> are, but I can't get it to print them back out? Is there a way?
>>
>> foreach (c; "👩‍👩‍👦‍👦🏳️‍🌈".byGrapheme) {
>>   writeln(c.<????>);
>> }
> 
> You're looking for `c[]`. But that won't work, because std.uni 
> apparently doesn't recognize those as grapheme clusters. The emojis may 
> be too new. std.uni is based on Unicode version 6.2, which dates from 
> 2012.

Oops! I didn't realize this, ignore my message about reporting a bug.

I still think it's very odd that printing a grapheme prints the data 
structure instead of the text it holds.
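
For reference, here is a minimal sketch of the three levels of iteration 
discussed in this thread: code units, code points, and graphemes. It 
uses a combining accent instead of the emoji, on the assumption that 
std.uni segments those correctly even with older Unicode data. The 
helper names (countCodeUnits etc.) are made up for illustration and are 
not part of Phobos.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;

// Hypothetical helpers, named only for this example.

// A string's .length is its number of UTF-8 code units (char).
size_t countCodeUnits(string s) { return s.length; }

// Range functions decode a string to code points (dchar).
size_t countCodePoints(string s) { return s.walkLength; }

// byGrapheme groups code points into grapheme clusters.
size_t countGraphemes(string s) { return s.byGrapheme.walkLength; }

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // two code points that form a single grapheme.
    string s = "noe\u0301l";

    assert(countCodeUnits(s) == 6);  // U+0301 takes two bytes in UTF-8
    assert(countCodePoints(s) == 5);
    assert(countGraphemes(s) == 4);

    // g[] slices the Grapheme into a range of its code points,
    // which writeln treats as text rather than as a struct.
    foreach (g; s.byGrapheme)
        writeln(g[]);
}
```

With the emoji from the original post the grapheme count may come out 
larger than expected, depending on which Unicode version the std.uni in 
your compiler release implements.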

-Steve

