The Case Against Autodecode

Walter Bright via Digitalmars-d digitalmars-d at puremagic.com
Fri May 13 06:34:09 PDT 2016


On 5/13/2016 3:43 AM, Marc Schütz wrote:
> On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
>> 7. Autodecode cannot be used with Unicode path/filenames, because it is legal
>> (at least on Linux) to have invalid UTF-8 as filenames. It turns out in the
>> wild that pure Unicode is not universal - there's lots of dirty Unicode that
>> should remain unmolested, and autodecode does not play well with that.
>
> This just means that filenames mustn't be represented as strings; it's unrelated
> to auto decoding.

It means much more than that; filenames are just an example. I recently fixed
MicroEmacs (my text editor) to assume the source is UTF-8, and display Unicode
characters. But it still needs to work with dirty UTF-8 without throwing
exceptions, modifying the text in place, or other tantrums.
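
To make that requirement concrete, here is a minimal sketch in Python (used
purely for illustration; it is not D and not MicroEmacs's actual code). It
assumes the editor keeps undecodable bytes as escape markers in memory, using
Python's "surrogateescape" error handler, and substitutes U+FFFD only when
drawing the text. The sample bytes and the function names load, save, display,
and strict_decode_fails are hypothetical.

```python
# Hypothetical sample: valid UTF-8 for "café" followed by a stray 0xFF byte.
dirty = b"caf\xc3\xa9\xff.txt"


def load(raw: bytes) -> str:
    """Decode for editing. Invalid bytes become lone surrogates, no exception."""
    return raw.decode("utf-8", errors="surrogateescape")


def save(text: str) -> bytes:
    """Encode for writing. Escaped bytes are restored exactly as they were."""
    return text.encode("utf-8", errors="surrogateescape")


def display(text: str) -> str:
    """Substitute U+FFFD for escaped bytes, for display only."""
    return "".join(
        "\ufffd" if "\udc80" <= ch <= "\udcff" else ch for ch in text
    )


def strict_decode_fails(raw: bytes) -> bool:
    """Show the behavior to avoid: strict decoding rejects dirty input."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

Under these assumptions, save(load(dirty)) returns the original bytes unchanged,
while display(load(dirty)) shows a replacement character where the bad byte
sits. The stored text is never rewritten just because it was viewed.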

