The Case Against Autodecode

Marco Leise via Digitalmars-d digitalmars-d at puremagic.com
Thu May 12 16:52:52 PDT 2016


On Thu, 12 May 2016 13:15:45 -0700,
Walter Bright <newshound2 at digitalmars.com> wrote:

> 7. Autodecode cannot be used with unicode path/filenames, because it is legal 
> (at least on Linux) to have invalid UTF-8 as filenames.

More precisely, they are byte strings in which '/' is reserved
to separate path elements. While an out-of-the-box Linux
nowadays typically presents everything as UTF-8, there are
still die-hards who use code pages, as well as corrupted file
systems and incorrectly bound network shares whose names
display with the wrong charset. It is safer to work with file
names as ubyte[], which also bypasses auto-decoding.
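
To illustrate, here is a minimal sketch of what that could look
like. It assumes only std.utf.validate and std.range.walkLength
from Phobos; isValidUtf8 is a hypothetical helper made up for
this example, not an existing library function. The name is kept
as raw bytes and only checked for UTF-8 validity when needed.

```d
import std.range : walkLength;
import std.utf : validate, UTFException;

/// Hypothetical helper: true if the raw bytes are well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] raw)
{
    try
    {
        // validate() throws UTFException on malformed input.
        validate(cast(const(char)[]) raw);
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}

void main()
{
    // "café" in Latin-1: the lone byte 0xE9 is not valid UTF-8.
    immutable(ubyte)[] latin1Name = [0x63, 0x61, 0x66, 0xE9];
    // "café" in UTF-8: 'é' is the two-byte sequence 0xC3 0xA9.
    immutable(ubyte)[] utf8Name = [0x63, 0x61, 0x66, 0xC3, 0xA9];

    assert(!isValidUtf8(latin1Name));
    assert(isValidUtf8(utf8Name));

    // As ubyte[], range functions see five bytes and never decode.
    assert(utf8Name.walkLength == 5);
    // As string, the same bytes are auto-decoded into four code points.
    assert((cast(string) utf8Name).walkLength == 4);
}
```

With the Latin-1 name, range algorithms on the ubyte[] still work
byte by byte, whereas treating it as a string would run into the
invalid sequence as soon as something decodes it.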

I'd like 'string' to mean valid UTF-8 in D as far as the
encoding goes. A filename should not be a 'string'.

-- 
Marco


