The Case Against Autodecode

Tobias M via Digitalmars-d digitalmars-d at puremagic.com
Sun May 29 06:42:57 PDT 2016


On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote:
> On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu 
> via Digitalmars-d wrote:
>> On 5/27/16 3:10 PM, ag0aep6g wrote:
>> > I don't think there is value in distinguishing by language. 
>> > The point of Unicode is that you shouldn't need to do that.
>> 
>> It seems code points are kind of useless because they don't 
>> really mean anything, would that be accurate? -- Andrei
>
> That's what we've been trying to say all along! :-P  They're a 
> kind of low-level Unicode construct used for building "real" 
> characters, i.e., what a layperson would consider to be a 
> "character".

Code points are *the fundamental unit* of Unicode. AFAIK most 
(all?) algorithms in the Unicode spec are defined in terms of 
code points.
Sure, some algorithms also work on the code unit level. That can 
be used as an optimization, but they are still defined on code 
points.
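
To make the optimization point concrete, here is a rough sketch in Python (the function names and the sample input are just made up for illustration). Counting code points is defined on code points, but for valid UTF-8 it can be done on code units alone by skipping continuation bytes, which have the bit pattern 10xxxxxx. This assumes the input is well-formed UTF-8; it is not meant as a validating decoder.

```python
def count_code_points_decoded(data: bytes) -> int:
    """Reference version: decode to code points, then count them."""
    return len(data.decode("utf-8"))


def count_code_points_fast(data: bytes) -> int:
    """Code-unit-level shortcut, assuming data is valid UTF-8.

    Every code point starts with exactly one non-continuation byte,
    so counting the bytes that are NOT of the form 10xxxxxx gives the
    number of code points without decoding anything.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


if __name__ == "__main__":
    # Hypothetical sample: ASCII, a 2-byte sequence and a 4-byte sequence.
    sample = "na\u00efve \U0001F600".encode("utf-8")
    print(count_code_points_decoded(sample), count_code_points_fast(sample))
```

Both functions answer the same question, and that question is stated in terms of code points; the second one merely gets there without leaving the code unit level.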

Code points also abstract over the different 
representations (UTF-...), providing a uniform "interface".
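
As a small illustration (a Python sketch; the sample string and helper names are my own, not from any library), the same text has a different number of code units in each encoding, yet decodes to the identical sequence of code points:

```python
# Bytes per code unit for the three encodings used in this sketch.
UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}

# Hypothetical sample: 'e', a combining acute accent, and a code point
# outside the Basic Multilingual Plane.
text = "e\u0301\U0001F600"


def code_units(s: str, encoding: str) -> int:
    """Number of code units s occupies in the given encoding."""
    return len(s.encode(encoding)) // UNIT_SIZE[encoding]


def code_points(s: str, encoding: str) -> list:
    """Round-trip s through the encoding and list its code points."""
    return [ord(c) for c in s.encode(encoding).decode(encoding)]


if __name__ == "__main__":
    for enc in UNIT_SIZE:
        print(enc, code_units(text, enc), [hex(cp) for cp in code_points(text, enc)])
```

The code unit counts differ per encoding, while the code point list stays the same. Note that this sample is still two "characters" to a layperson, so this only shows the uniform interface, not that code points are user-perceived characters.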

