The Case Against Autodecode

Fri May 27 12:10:50 PDT 2016

On 05/27/2016 08:42 PM, Andrei Alexandrescu wrote:
> Which languages are covered by code points, and which languages require
> graphemes consisting of multiple code points? How does normalization
> play into this? -- Andrei

I don't think there is value in distinguishing by language. The point of 
Unicode is that you shouldn't need to do that.

I think there are scripts that use combining characters extensively, but 
Unicode also has stuff like combining arrows. Those can make sense in an 
otherwise plain English text.

For example: 'a' + U+20D7 = a⃗.

There is no precomposed character for that sequence, so normalization 
can't reduce it to a single code point.
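
To make that concrete, here is a minimal sketch in Python (not D, and not 
anything from Phobos) using the standard unicodedata module. The helper name 
nfc_len is made up for illustration. It compares 'e' + U+0301, which NFC can 
compose into a single code point, with 'a' + U+20D7, which it cannot:

```python
import unicodedata

def nfc_len(s: str) -> int:
    """Return the number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", s))

# 'e' + COMBINING ACUTE ACCENT: a precomposed form (U+00E9) exists.
e_acute = "e\u0301"
# 'a' + COMBINING RIGHT ARROW ABOVE: no precomposed form exists.
a_arrow = "a\u20d7"

print(nfc_len(e_acute))  # 1
print(nfc_len(a_arrow))  # 2
```

So even after normalization, iterating by code point still splits the second 
string in the middle of what a reader sees as one character.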