dmd foreach loops throw exceptions on invalid UTF sequences, use replacementDchar instead

H. S. Teoh hsteoh at quickfur.ath.cx
Sat Nov 6 05:36:07 UTC 2021


On Sat, Nov 06, 2021 at 04:18:51AM +0000, Alexey via Digitalmars-d wrote:
> On Saturday, 6 November 2021 at 04:07:35 UTC, Alexey wrote:
> 
> > 3. IMHO, one grapheme should be and alias to ubyte[] or to one BigInt;
> 
> or may be, even, define one grapheme as dchar[]. or maybe, even, define new
> separate type for 'codepoint' and define one grapheme as codepoint[].

Unfortunately, codepoint != grapheme. This was the fundamental error
with autodecoding that made it so bad. It costs us a performance hit but
doesn't even produce the right results in return.
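
To make the distinction concrete, here is a minimal sketch (not from the
original discussion) that counts the same string three ways. The helper
names codeUnits, codePoints and graphemes are made up for illustration;
walkLength and byGrapheme are the Phobos functions from std.range and
std.uni.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;

// Hypothetical helper: number of UTF-8 code units (bytes) in the string.
size_t codeUnits(string s) { return s.length; }

// Hypothetical helper: number of code points. Treating a string as a range
// autodecodes it to dchar, so walkLength counts code points here.
size_t codePoints(string s) { return s.walkLength; }

// Hypothetical helper: number of graphemes (user-perceived characters),
// which requires full grapheme segmentation.
size_t graphemes(string s) { return s.byGrapheme.walkLength; }

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: displayed as one character.
    string s = "e\u0301";
    writeln(codeUnits(s));  // 3
    writeln(codePoints(s)); // 2
    writeln(graphemes(s));  // 1
}
```

Autodecoding pays for the middle number, but what the user sees on screen
corresponds to the last one.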

And even more unfortunately, grapheme segmentation is an extremely
convoluted (i.e. slow) operation that you would normally *not* want to
perform unless your code absolutely has to.


T

-- 
Let's eat some disquits while we format the biskettes.

