The Case Against Autodecode

ag0aep6g via Digitalmars-d digitalmars-d at puremagic.com
Thu Jun 2 12:48:47 PDT 2016


On 06/02/2016 09:26 PM, Andrei Alexandrescu wrote:
> ag0aep6g <anonymous at example.com> wrote:
>> On 06/02/2016 09:05 PM, Andrei Alexandrescu wrote:
>>> Pretty much everything. Consider s and s1 string variables with possibly
>>> different encodings (UTF8/UTF16).
>>>
>>> * s.all!(c => c == 'ö') works only with autodecoding. It returns always
>>> false without.
>>
>> Doesn't work with autodecoding (to code points) when a combining
>> diaeresis (U+0308) is used in s.
>
> Works if s is normalized appropriately. No?

Works when normalized to precomposed characters, yes.

That's not a given, of course. When the user is aware enough to 
normalize their strings that way, then they should be able to call 
byDchar explicitly.
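
A minimal sketch of that point, assuming Phobos' std.uni.normalize and
std.utf.byDchar. The helper allOUmlaut is a hypothetical name made up for
this illustration, and explicit byDchar stands in for what autodecoding
does implicitly:

```d
import std.algorithm.searching : all;
import std.uni : normalize, NFC;
import std.utf : byDchar;

// Hypothetical helper: true if every code point in s is U+00F6 ('ö').
bool allOUmlaut(string s)
{
    return s.byDchar.all!(c => c == '\u00F6');
}

void main()
{
    string precomposed = "\u00F6";   // 'ö' as a single code point
    string decomposed  = "o\u0308";  // 'o' + combining diaeresis

    // Code-point comparison matches the precomposed form only.
    assert(allOUmlaut(precomposed));
    assert(!allOUmlaut(decomposed));

    // After explicit NFC normalization the decomposed form matches too.
    assert(allOUmlaut(normalize!NFC(decomposed)));
}
```

The normalization step is the part the caller has to know about; decoding
to code points alone does not make the two spellings compare equal.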

And of course you can't do s.all!(c => c == 'a⃗'), despite a⃗ looking like
one character. It's 'a' plus a combining arrow (U+20D7), and there is no
precomposed form to normalize to. You need byGrapheme for that.
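
Roughly like this, assuming std.uni.byGrapheme and comparing each
grapheme's code points against the expected cluster. allGraphemes is a
hypothetical helper made up for this sketch, not something in Phobos:

```d
import std.algorithm.comparison : equal;
import std.algorithm.searching : all;
import std.range.primitives : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

// Hypothetical helper: true if every grapheme cluster in s consists of
// exactly the code points in cluster.
bool allGraphemes(string s, string cluster)
{
    return s.byGrapheme.all!(g => g[].equal(cluster.byDchar));
}

void main()
{
    // Two copies of 'a' + combining right arrow above (U+20D7).
    string s = "a\u20D7a\u20D7";

    assert(s.byDchar.walkLength == 4);     // four code points
    assert(s.byGrapheme.walkLength == 2);  // two user-perceived characters

    // No single code point can stand for the whole cluster...
    assert(!s.byDchar.all!(c => c == 'a'));
    // ...but a grapheme-level comparison works.
    assert(allGraphemes(s, "a\u20D7"));
}
```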

