Program logic bugs vs input/environmental errors

Marco Leise via Digitalmars-d digitalmars-d at puremagic.com
Sun Oct 5 08:44:31 PDT 2014


On Sat, 04 Oct 2014 13:12:43 -0700,
Walter Bright <newshound2 at digitalmars.com> wrote:

> On 10/4/2014 3:30 AM, Steven Schveighoffer wrote:
> > On 10/4/14 4:47 AM, Walter Bright wrote:
> >> On 9/29/2014 8:13 AM, Steven Schveighoffer wrote:
> >>> I can think of cases where it's programmer error, and cases where it's
> >>> user error.
> >>
> >> More carefully design the interfaces if programmer error and input error
> >> are conflated.
> >>
> >
> > You mean more carefully design File's ctor? How so?
> 
> You can start with deciding if random binary data passed as a "file name" is
> legal input to the ctor or not.

In POSIX terms [1], a file name consisting only of A-Za-z0-9._-
is a "character string" (a portable file name), whereas anything
not representable in all locales is just a "string".
Every locale's charset is required to be able to represent
A-Za-z0-9._-, but may map those characters to different byte
values than ASCII does. Only the slash '/' must have a fixed
value of 0x2F.
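
To make the distinction concrete, here is a minimal sketch (in Python, not D) of a check for the POSIX portable filename character set, i.e. letters, digits, period, underscore and hyphen. The name is_portable_filename is made up for illustration, and the sketch ignores the other portability recommendations such as name length limits.

```python
import string

# POSIX portable filename character set: A-Z a-z 0-9 . _ -
PORTABLE_CHARS = frozenset(string.ascii_letters + string.digits + "._-")


def is_portable_filename(name: str) -> bool:
    """Return True if name uses only portable filename characters.

    Hypothetical helper for illustration; it checks the character
    set only, not length limits or a leading hyphen.
    """
    # An empty string is never a usable file name.
    return bool(name) and all(ch in PORTABLE_CHARS for ch in name)
```

Anything that fails this check may still be a perfectly legal file name on a given system; it just is not guaranteed to be representable everywhere.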

From that I conclude that File() should open files by ubyte[]
exclusively in order to be POSIX compliant.
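
As a rough illustration of what "open by raw bytes" means, the following sketch uses Python's bytes paths as a stand-in for a ubyte[] based File(). It assumes a Linux-style file system that accepts any byte except NUL and '/' in a name; some systems (macOS, for instance) may reject names that are not valid UTF-8. The function name roundtrip_raw_name is invented for the example.

```python
import os
import tempfile


def roundtrip_raw_name(raw_name: bytes, payload: bytes):
    """Create a file named by raw bytes, list it, and read it back.

    Illustrative sketch only: assumes the file system accepts
    arbitrary bytes (other than NUL and '/') in file names.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Work in bytes throughout so no charset is ever assumed.
        tmp_bytes = os.fsencode(tmp)
        path = os.path.join(tmp_bytes, raw_name)
        with open(path, "wb") as f:
            f.write(payload)
        # Listing a bytes path yields the names as raw bytes.
        listing = os.listdir(tmp_bytes)
        with open(path, "rb") as f:
            data = f.read()
    return listing, data
```

Calling roundtrip_raw_name(b"caf\xe9.txt", b"hello") never decodes the name, so the 0xE9 byte, which is not valid UTF-8 here, passes through untouched.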

This is what frustrates me so much about POSIX: it makes it
practically impossible to write correct code. Even Qt and Gtk+
settled for the system locale and UTF-8, respectively, as the
assumed I/O charset for all file names, although each file
system could be mounted with a different charset, e.g. CD-ROMs
in an ISO charset.
Windows does much better by offering Unicode versions in
addition to the "ANSI" functions.
The only fix I see for POSIX is to deprecate all locales
other than UTF-8 at some point.
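
The charset ambiguity can be shown in a few lines. This is only a sketch in Python with a made-up example name: the same bytes are a valid name under ISO-8859-1 but fail strict UTF-8 decoding, and Python's surrogateescape error handler is one way to carry such bytes through a string type without losing them.

```python
# "é.txt" as an ISO-8859-1 system would store it (example name).
raw_name = b"\xe9.txt"

# Under ISO-8859-1 every byte maps to a character.
as_latin1 = raw_name.decode("iso-8859-1")

# Under strict UTF-8, 0xE9 starts a multi-byte sequence that the
# following '.' does not continue, so decoding fails.
try:
    raw_name.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# surrogateescape maps the undecodable byte to a lone surrogate,
# so the original bytes can be recovered exactly.
lossless = raw_name.decode("utf-8", errors="surrogateescape")
restored = lossless.encode("utf-8", errors="surrogateescape")
```

A program that assumes one charset for all file names has no way to tell which of the two interpretations the file system intended.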

[1]
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_267

-- 
Marco


