How to detect start of Unicode symbol and count amount of graphemes

Kagamin via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Mon Oct 6 07:56:52 PDT 2014


On Sunday, 5 October 2014 at 12:09:34 UTC, Uranuz wrote:
> Is there a way to detect the first code unit of a grapheme 
> without the overhead of using the Grapheme struct? I tried 
> checking if ch < 128 (for UTF-8), but this doesn't work. How 
> do I check whether a byte is a continuation of a single code 
> point or whether a new sequence has started?

Are you trying to split strings? If you want to limit how often 
you use graphemes, check whether the next 10 code units contain an 
ASCII character; if that fails, fall back to graphemes.
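
Roughly what I mean, as an untested sketch rather than a tuned 
implementation. The helper names (isContinuation, countCodePoints, 
countGraphemes) are made up for illustration; only byGrapheme and 
walkLength come from Phobos. It assumes valid UTF-8 input, and for 
simplicity it checks the whole string for ASCII instead of a 10 code 
unit window. '\r' is kept off the fast path because CR LF may form a 
single grapheme.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

/// True if `b` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
/// Any other byte (ASCII or a lead byte) starts a new code point.
bool isContinuation(char b) pure nothrow @nogc @safe
{
    return (b & 0xC0) == 0x80;
}

/// Counts code points by counting the bytes that start a sequence.
/// Assumes `s` is valid UTF-8; no validation is done here.
size_t countCodePoints(const(char)[] s) pure nothrow @nogc @safe
{
    size_t n = 0;
    foreach (char c; s)
        if (!isContinuation(c))
            ++n;
    return n;
}

/// Counts graphemes with an ASCII fast path (hypothetical helper).
/// If any code unit is non-ASCII or '\r', fall back to byGrapheme.
size_t countGraphemes(string s)
{
    foreach (char c; s)
        if (c >= 0x80 || c == '\r')
            return s.byGrapheme.walkLength; // slow path
    return s.length; // pure ASCII: one grapheme per code unit
}
```

For "e\u0301" (e plus a combining acute accent) countCodePoints gives 
2 while countGraphemes takes the slow path and gives 1. Note that 
ch < 128 only tells you the byte is ASCII; a lead byte of a multi-byte 
sequence is also >= 128, which is why that test alone does not find 
sequence starts.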

