The Case Against Autodecode

Tue May 31 06:15:27 PDT 2016

On Tue, 31 May 2016 07:17:03 +0000,
default0 <Kevin.Labschek at gmx.de> wrote:

> Thinking about this a bit more - what algorithms are actually 
> correct when implemented on the level of code units?

Calculating the buffer size of a string, validation, and
fast versions of general algorithms that can be defined in
terms of ASCII, like skipAsciiWhitespace(), splitByComma(),
and splitByLineAscii().
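
To illustrate why these work on code units: in UTF-8 every byte of a
multi-byte sequence is >= 0x80, so an ASCII byte such as ' ' or ','
can never appear inside an encoded non-ASCII character. A rough sketch
follows. The function bodies are my own guess at what helpers with
these names might look like, not existing Phobos functions.

```d
/// Hypothetical helper: drops leading ASCII whitespace by inspecting
/// UTF-8 code units only. Indexing a string yields a char (code unit),
/// so nothing is decoded here.
string skipAsciiWhitespace(string s) pure nothrow @nogc @safe
{
    size_t i = 0;
    while (i < s.length &&
           (s[i] == ' ' || s[i] == '\t' || s[i] == '\n' || s[i] == '\r'))
        ++i;
    return s[i .. $];
}

/// Hypothetical helper: splits at every ',' code unit. Declaring the
/// loop variable as `char` makes foreach walk code units instead of
/// decoded code points.
string[] splitByComma(string s) pure nothrow @safe
{
    string[] parts;
    size_t start = 0;
    foreach (i, char c; s)
    {
        if (c == ',')
        {
            parts ~= s[start .. i];
            start = i + 1;
        }
    }
    parts ~= s[start .. $]; // the tail after the last comma
    return parts;
}

unittest
{
    // Non-ASCII characters pass through untouched.
    assert(skipAsciiWhitespace(" \t\näb") == "äb");
    assert(splitByComma("grün,blau,,é") == ["grün", "blau", "", "é"]);
}
```

The slices returned always start and end on code point boundaries,
because the only positions we cut at are ASCII bytes.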

> I would also think that if you know your strings are normalized 
> in the same normalization form (for example because they come 
> from the same normalized source), you can check two strings for 
> equality on the code unit level, but my understanding of Unicode 
> is still quite lacking, so I'm not sure on that.

That's correct.
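
A small sketch of what that means in practice, assuming both strings
use the same encoding (UTF-8 here). equalCodeUnits is a name I made up
for illustration; normalize and NFC come from std.uni.

```d
import std.uni : normalize, NFC;

/// Hypothetical helper: plain array comparison on strings compares
/// the code units one by one, without any decoding.
bool equalCodeUnits(string a, string b) pure nothrow @nogc @safe
{
    return a == b;
}

unittest
{
    string composed   = "\u00E9";  // é as a single code point
    string decomposed = "e\u0301"; // e followed by combining acute accent

    // Same text for the reader, different code units.
    assert(!equalCodeUnits(composed, decomposed));

    // Once both are in the same normalization form, the code unit
    // comparison gives the expected answer.
    assert(equalCodeUnits(normalize!NFC(composed),
                          normalize!NFC(decomposed)));
}
```

If the strings are already known to come from the same normalized
source, the normalize calls can be skipped and only the cheap
comparison remains.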

-- 
Marco