Major performance problem with std.array.front()

Vladimir Panteleev vladimir at thecybershadow.net
Fri Mar 7 17:39:00 PST 2014


On Saturday, 8 March 2014 at 01:23:27 UTC, Andrei Alexandrescu 
wrote:
> Yup, the grapheme issue. This should work.

No. It does not work because grapheme segmentation is not the 
same as normalization. Even if you fix the code (should be: 
assert(s.byGrapheme.canFind!"a[] == b"("é"))), it will not work 
because byGrapheme does not normalize (and not all graphemes can 
be normalized to a single code point anyway). And there is more 
than one type of normalization - you need to pick the form that 
matches what you're trying to achieve.
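
To make the distinction concrete, here is a minimal sketch, 
assuming std.uni's byGrapheme and normalize behave as documented. 
The sample strings and variable names are mine, chosen only for 
illustration. It shows that segmentation alone does not make the 
two spellings of "é" compare equal, that some graphemes have no 
single-code-point form, and that the normalization forms disagree 
with each other:

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC, NFD, NFKC;

void main()
{
    // Two spellings of the same user-perceived character "é".
    string precomposed = "\u00E9";  // single code point U+00E9
    string decomposed  = "e\u0301"; // 'e' + U+0301 COMBINING ACUTE ACCENT

    // Segmentation: each spelling is exactly one grapheme...
    assert(precomposed.byGrapheme.walkLength == 1);
    assert(decomposed.byGrapheme.walkLength == 1);

    // ...but the code points inside still differ, so comparing fails.
    assert(precomposed != decomposed);

    // Normalizing both sides to the same form is what makes them equal.
    assert(normalize!NFC(decomposed) == precomposed);
    assert(normalize!NFD(precomposed) == decomposed);

    // Not every grapheme has a precomposed form: 'q' + U+0303
    // COMBINING TILDE stays two code points even after NFC.
    string qTilde = "q\u0303";
    assert(qTilde.byGrapheme.walkLength == 1);
    assert(normalize!NFC(qTilde).walkLength == 2);

    // The forms are not interchangeable: NFKC folds the ligature
    // U+FB01 into "fi", while NFC leaves it untouched.
    string ligature = "\uFB01";
    assert(normalize!NFC(ligature) == ligature);
    assert(normalize!NFKC(ligature) == "fi");
}
```

Whether NFC, NFD or one of the compatibility forms is appropriate 
depends on the task (e.g. display vs. searching), which is why no 
single default can be right for everyone.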

> Graphemes are the next level of Nirvana above code points, but 
> that doesn't mean it's graphemes or nothing.

It's not about types, it's about algorithms. It's never "X or 
nothing" - unless X is "actually understanding Unicode". 
Everything else is a compromise.

Compromises are acceptable, but not when they are built into the 
language as the standard way of working with text, thus hiding 
the problems that come with them.

