How to detect start of Unicode symbol and count amount of graphemes
Uranuz via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sun Oct 5 05:09:33 PDT 2014
> You can use std.uni.byGrapheme to iterate by graphemes:
> http://dlang.org/phobos/std_uni.html#.byGrapheme
>
> AFAIK, graphemes are not "self synchronizing", but codepoints
> are. You can pop code units until you reach the beginning of a
> new codepoint. From there, you can iterate by graphemes, though
> your first grapheme might be off.
Is there a way to detect just the first code unit of a grapheme
without the overhead of using the Grapheme struct? I tried
checking ch < 128 (for UTF-8), but that doesn't work. How can I
check whether a byte is a continuation of the encoding of a
single code point, or whether a new sequence has started?
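
For reference, a minimal sketch of the approach suggested in the
quoted reply. In UTF-8, continuation bytes always have the bit
pattern 10xxxxxx, so a byte starts a new code point exactly when
(ch & 0xC0) != 0x80. The test ch < 128 only matches ASCII and
misses the lead bytes of multi-byte sequences (0xC0 and above).
The helper names isContinuationByte, skipToCodePointStart and
countGraphemes are made up for illustration, and the input is
assumed to be otherwise valid UTF-8. Note that this only finds a
code point boundary: a combining mark also starts a code point, so
the first grapheme may still be cut off, as the quoted reply says.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

/// True if `c` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
/// Hypothetical helper, not part of Phobos.
bool isContinuationByte(char c)
{
    return (c & 0xC0) == 0x80;
}

/// Drops leading continuation bytes so that the slice begins at the
/// first code unit of a code point. Assumes the rest is valid UTF-8.
const(char)[] skipToCodePointStart(const(char)[] s)
{
    while (s.length && isContinuationByte(s[0]))
        s = s[1 .. $];
    return s;
}

/// Counts graphemes by walking the lazy range from std.uni.byGrapheme.
size_t countGraphemes(const(char)[] s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // Each Cyrillic letter takes two bytes, so slicing at byte 1
    // lands in the middle of the first code point.
    string text = "Привет";
    auto broken = text[1 .. $];
    assert(isContinuationByte(broken[0]));

    auto fixed = skipToCodePointStart(broken);
    assert(fixed == "ривет");
    assert(countGraphemes(fixed) == 5);

    // 'e' followed by U+0308 (combining diaeresis) is one grapheme.
    assert(countGraphemes("noe\u0308l") == 4);
}
```

Counting with byGrapheme still constructs Grapheme values
internally; the bit test alone is enough if only code point
boundaries are needed.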