The Case Against Autodecode

Thu Jun 2 13:47:12 PDT 2016

On Thursday, 2 June 2016 at 20:36:12 UTC, Andrei Alexandrescu 
wrote:
> On 06/02/2016 04:33 PM, ag0aep6g wrote:
>> Operating on code points by default is seen as not 
>> particularly useful.
>
> By whom? The "support level 1" folks yonder at the Unicode 
> standard? :o) -- Andrei

From the standard:

> Level 1 support works well in many circumstances. However, it 
> does not handle more complex languages or extensions to the 
> Unicode Standard very well. Particularly important cases are 
> surrogates, canonical equivalence, word boundaries, grapheme 
> boundaries, and loose matches. (For more information about 
> boundary conditions, see The Unicode Standard, Section 5-15.)
>
> Level 2 support matches much more what user expectations are 
> for sequences of Unicode characters. It is still locale 
> independent and easily implementable. However, the 
> implementation may be slower when supporting Level 2, and some 
> expressions may require Level 1 matches. Thus it is usually 
> required to have some sort of syntax that will turn Level 2 
> support on and off.

To me, that doesn't sound like much of an endorsement for 
defaulting to only level 1 support: "it does not handle more 
complex languages or extensions to the Unicode Standard very 
well".
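
As a rough illustration of the canonical equivalence case the standard mentions, here is a minimal sketch. It is in Python only because its `str` is also indexed by code point, so it behaves much like a code-point-by-default range. The names `precomposed`, `decomposed` and `code_point_facts` are made up for this example, and it is not meant as a statement about how any particular regex engine implements the levels.

```python
import re
import unicodedata

# Two canonically equivalent spellings of the same user-visible word.
precomposed = "caf\u00e9"   # "é" as the single code point U+00E9
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def code_point_facts():
    """Collect what code-point-level operations report for the two spellings."""
    return {
        # Plain comparison works on code points, so the strings differ.
        "equal_raw": precomposed == decomposed,
        # Length is counted in code points, not user-perceived characters.
        "lengths": (len(precomposed), len(decomposed)),
        # "." matches exactly one code point, so it covers "é" only
        # when "é" is precomposed.
        "dot_matches_precomposed": re.fullmatch(r"caf.", precomposed) is not None,
        "dot_matches_decomposed": re.fullmatch(r"caf.", decomposed) is not None,
        # Normalizing to NFC first makes the two spellings compare equal.
        "equal_after_nfc": unicodedata.normalize("NFC", decomposed) == precomposed,
    }


if __name__ == "__main__":
    for name, value in code_point_facts().items():
        print(name, value)
```

Both strings look identical to a user, but at the code point level they compare unequal, have different lengths, and match different patterns unless the caller normalizes first.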