The Case Against Autodecode

Wed Jun 1 09:41:39 PDT 2016

On 06/01/2016 10:29 AM, Andrei Alexandrescu wrote:
> On 06/01/2016 06:25 AM, Marc Schütz wrote:
>> On Tuesday, 31 May 2016 at 21:01:17 UTC, Andrei Alexandrescu wrote:
>>>
>>> The point is to operate on representation-independent entities
>>> (Unicode code points) instead of low-level representation-specific
>>> artifacts (code units).
>>
>> _Both_ are low-level representation-specific artifacts.
>
> Maybe this is a misunderstanding. Representation = how things are laid
> out in memory. What does associating numbers with various Unicode
> symbols have to do with representation? -- Andrei
>

As has been explained countless times already, code points are a non-1:1
internal representation of graphemes: one grapheme can take several code
points, and the same grapheme can often be spelled with more than one
code point sequence. Code points don't exist for their own sake; they
exist purely as a way to encode graphemes. Whether that technically
qualifies as "memory representation" or not is irrelevant: it's still a
low-level implementation detail of text.
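
To make the non-1:1 point concrete, here is a minimal sketch in Python (chosen only because its str type is indexed by code point; the helper names are made up for illustration). It takes one user-perceived character, "é", spelled as two different code point sequences, and counts it at both the code point and the code unit level:

```python
import unicodedata

# Two spellings of what a reader sees as the single character "é".
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def code_point_count(text):
    """Count code points; len() on a Python str counts code points."""
    return len(text)


def utf8_code_unit_count(text):
    """Count UTF-8 code units (bytes) needed to encode the text."""
    return len(text.encode("utf-8"))


def canonically_equal(a, b):
    """Compare two strings after NFC normalization (canonical equivalence)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
        print(label, code_point_count(text), utf8_code_unit_count(text))
    print("same code points:", precomposed == decomposed)
    print("canonically equal:", canonically_equal(precomposed, decomposed))
```

Neither count is "the number of characters" in any sense a user would recognize, and the two spellings only compare equal after normalization. Counting graphemes properly needs a segmentation algorithm on top of this, which as far as I know Python's standard library doesn't ship.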