The Case Against Autodecode
tsbockman via Digitalmars-d
digitalmars-d at puremagic.com
Thu Jun 2 13:36:01 PDT 2016
On Thursday, 2 June 2016 at 20:13:14 UTC, Andrei Alexandrescu
wrote:
> On 06/02/2016 03:34 PM, tsbockman wrote:
>> Your 'ö' examples will NOT work reliably with auto-decoded code
>> points, and for nearly the same reason that they won't work with
>> code units; you would have to use byGrapheme.
>
> They do work per spec: find this code point. It would be
> surprising if 'ö' were found but the string were positioned at
> a different code point.
Your examples will pass or fail depending on how (and whether)
the 'ö' grapheme is normalized. They only ever succeed because
'ö' happens to be one of the privileged graphemes that *can* be
(but often isn't!) represented as a single code point. Many other
graphemes have no such representation.
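
To make the normalization point concrete, here is a minimal sketch in Python (not D), using only the standard `unicodedata` module. The sample word "schön", the "q" plus diaeresis pair, and the helper name `contains_code_point` are illustrative assumptions, not taken from the examples under discussion. It only shows that a code point search depends on how the text happens to be normalized; it does not do grapheme segmentation.

```python
import unicodedata

# Two spellings of the same visible text (sample word is an assumption).
composed = "sch\u00f6n"      # 'ö' as the single code point U+00F6
decomposed = "scho\u0308n"   # 'o' followed by U+0308 COMBINING DIAERESIS


def contains_code_point(text, code_point):
    """Hypothetical helper: search a string for one code point."""
    return code_point in text


# The code point search succeeds only for the precomposed spelling.
found_in_composed = contains_code_point(composed, "\u00f6")
found_in_decomposed = contains_code_point(decomposed, "\u00f6")

# Lengths differ at the code point level and at the UTF-8 code unit level.
code_points = (len(composed), len(decomposed))
code_units = (len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))

# Normalizing to NFC makes the two spellings compare equal.
equal_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

# 'q' + diaeresis has no precomposed code point, so even NFC leaves
# this single grapheme as two code points.
q_diaeresis = unicodedata.normalize("NFC", "q\u0308")

print(found_in_composed, found_in_decomposed)
print(code_points, code_units)
print(equal_after_nfc, len(q_diaeresis))
```

Neither the code unit count nor the code point count matches what a reader sees as five characters in both spellings, which is the gap that grapheme-level iteration is meant to close.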
Working directly with code points is sometimes useful anyway -
but then, working with code units can be, too. Neither will lead
to inherently "correct" Unicode processing, and in the absence of
a compelling context, your examples fall completely flat as an
argument for the inherent superiority of processing at the code
point level.
>> The fact that you still don't get that, even after a dozen plus
>> attempts by the community to explain the difference, makes you
>> unfit to direct Phobos' Unicode support.
>
> Well there's gotta be a reason why my basic comprehension is
> under constant scrutiny whereas yours is safe.
Who said mine is safe? I *know* that I'm not qualified to be in
charge of this.
Your comprehension is under greater scrutiny because you are
proposing to overrule nearly all other active contributors
combined.
>> Please, either go study Unicode until you
>> really understand it, or delegate this issue to someone else.
>
> Would be happy to. To whom would I delegate?
If you're serious, I would suggest Dmitry Olshansky. He seems to
be our top Unicode expert, based on his contributions to
`std.uni` and `std.regex`. But if he is unwilling or unsuitable
for some reason, there are other candidates participating in this
thread (not me).