dmd foreach loops throw exceptions on invalid UTF sequences, use replacementDchar instead

Alexey invalid at email.address
Sat Nov 6 04:07:35 UTC 2021


On Thursday, 4 November 2021 at 02:26:20 UTC, Walter Bright wrote:
> https://issues.dlang.org/show_bug.cgi?id=22473
>
> I've tried to fix this before, but too many people objected.
>
> Are we fed up with this yet? I sure am.
>
> Who wants to take up this cudgel and fix the durned thing once 
> and for all?
>
> (It's unclear if it would even break existing code.)

I didn't read the thread, and I'm not an expert in D or Unicode, 
of course.

But if I needed to solve the problem of Unicode handling, I would 
do the following:

1. Define a type for the 'grapheme', so that a grapheme can store 
any Unicode symbol.
2. Define a string of graphemes as an array of grapheme, so the 
programmer can use the usual array tools on it at any time, and 
things like .length and slicing [x..y] work as usual. Call this, 
for instance, 'gstring' or 'graphstring'.
3. IMHO, one grapheme should be an alias to ubyte[] or to one 
BigInt.
4. Conversion from string/wstring/dstring/ubyte[]/BigInt[]/etc. to 
['gstring' or 'graphstring'] should be automatic, and this should 
be stated in the documentation.
5. ['gstring' or 'graphstring'] should have functions to convert 
to string/wstring/dstring/ubyte[]/BigInt[]/etc.
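
A rough sketch of what I mean, using what Phobos already has. 
Assumptions: I use std.uni's Grapheme as a stand-in for the 
ubyte[]/BigInt alias from point 3, and the names GString, 
toGString and toUtf8 are made up for this example (they are not 
in Phobos). The conversion here is explicit, not automatic as in 
point 4, and the sketch does nothing special about invalid UTF 
input.

```d
import std.array : array;
import std.conv : text;
import std.traits : isSomeString;
import std.uni : Grapheme, byGrapheme, byCodePoint;

// Hypothetical alias for the proposed 'gstring': a plain array of
// graphemes, so .length and slicing count user-visible symbols.
alias GString = Grapheme[];

// Hypothetical helper: string/wstring/dstring -> GString.
GString toGString(S)(S s) if (isSomeString!S)
{
    return s.byGrapheme.array;
}

// Hypothetical helper: GString -> UTF-8 string.
string toUtf8(GString g)
{
    return g.byCodePoint.text;
}

void main()
{
    // 'e' followed by U+0308 (combining diaeresis) is one grapheme.
    string s = "noe\u0308l";
    GString g = s.toGString;

    assert(s.length == 6);                    // UTF-8 code units
    assert(g.length == 4);                    // graphemes
    assert(g[0 .. 3].toUtf8 == "noe\u0308");  // slice keeps the pair whole
    assert(g.toUtf8 == s);                    // round trip
}
```

With something like this, the usual array operations work on 
whole symbols, at the cost of one allocation for the array and a 
larger element type than char.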

