The Case Against Autodecode
Walter Bright via Digitalmars-d
digitalmars-d at puremagic.com
Fri May 27 16:26:20 PDT 2016
On 5/27/2016 11:27 AM, Andrei Alexandrescu wrote:
> On 5/27/16 1:11 PM, Walter Bright wrote:
>> They mean code units.
>
> Always valid or potentially invalid as well? -- Andrei
Some years ago I would have said always valid. Experience, however, says that
Unicode is often dirty and code should be tolerant of that.
Consider Unicode in a text editor. You can't have it throwing exceptions,
silently changing things to replacement characters, etc., when there are a few
invalid sequences in it. You also can't just say "the file isn't Unicode" and
refuse to display the Unicode in it.
It isn't hard to deal with invalid Unicode in a user-friendly manner.
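
As a rough illustration of the text editor case, here is a minimal sketch in Python (chosen only for brevity; it is not D code and not taken from Phobos). The function names are hypothetical. It assumes the editor keeps invalid bytes in its buffer, shows them as visible escapes rather than replacement characters, and writes them back unchanged on save.

```python
def render_for_display(raw: bytes) -> str:
    """Decode for display only: invalid bytes appear as visible \\xNN escapes
    instead of raising or turning into U+FFFD."""
    return raw.decode("utf-8", errors="backslashreplace")


def load(raw: bytes) -> str:
    """Decode into an editable buffer: each invalid byte is kept as a lone
    surrogate (U+DC80..U+DCFF), so nothing is lost."""
    return raw.decode("utf-8", errors="surrogateescape")


def save(text: str) -> bytes:
    """Encode the buffer back to bytes: the lone surrogates produced by
    load() become the original invalid bytes again."""
    return text.encode("utf-8", errors="surrogateescape")


# A mostly valid UTF-8 file containing one stray 0xFF byte.
raw = b"caf\xc3\xa9 \xff ok"

shown = render_for_display(raw)   # the bad byte is shown as the text \xff
buffer = load(raw)                # no exception, no replacement character
assert save(buffer) == raw        # saving round-trips the file byte for byte
```

Both error handlers are part of Python's standard codec machinery. The point of the sketch is only that tolerating dirty input needs neither an exception nor a lossy substitution.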