The Case Against Autodecode
Marc Schütz via Digitalmars-d
digitalmars-d at puremagic.com
Fri May 13 03:49:24 PDT 2016
On Thursday, 12 May 2016 at 23:16:23 UTC, H. S. Teoh wrote:
> Therefore, autodecoding actually only produces intuitively
> correct results when your string has a 1-to-1 correspondence
> between grapheme and code point. In general, this is only true
> for a small subset of languages, mainly a few common European
> languages and a handful of others. It doesn't work for Korean,
> and doesn't work for any language that uses combining
> diacritics or other modifiers. You need byGrapheme to have the
> correct results.
In fact, even most European languages are affected if NFD
normalization is used, which is the default on Mac OS X.
And this is actually the main problem with autodecoding: it was
introduced to make Unicode string handling correct. It doesn't
achieve that, so it has no justification.
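
To make the NFD point concrete, here is a minimal sketch in D. It assumes Phobos' std.uni (normalize, byGrapheme) and std.range.walkLength; the helper names codePointCount and graphemeCount are made up for this illustration and are not part of any library.

```d
import std.range : walkLength;
import std.uni;

// Hypothetical helper: counts what autodecoding iterates over (code points).
size_t codePointCount(string s)
{
    return s.walkLength;
}

// Hypothetical helper: counts user-perceived characters (grapheme clusters).
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    string composed = "\u00E9";    // "é" as one precomposed code point (NFC)
    string decomposed = "e\u0301"; // "e" + U+0301 COMBINING ACUTE ACCENT (NFD)

    // Both strings are the same text in different normalization forms.
    assert(normalize!NFD(composed) == decomposed);

    // Autodecoding walks code points, so the two forms disagree.
    assert(codePointCount(composed) == 1);
    assert(codePointCount(decomposed) == 2);

    // Only grapheme segmentation gives the same answer for both.
    assert(graphemeCount(composed) == 1);
    assert(graphemeCount(decomposed) == 1);
}
```

So a plain "é" already breaks the 1-to-1 assumption as soon as the input arrives in decomposed form.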