[Issue 1357] Cannot use FFFF and FFFE in Unicode escape sequences.

Wed Oct 3 14:44:50 PDT 2007

http://d.puremagic.com/issues/show_bug.cgi?id=1357

------- Comment #15 from smjg at iname.com  2007-10-03 16:44 -------
(In reply to comment #14)
>> If we're going to change this, toStringz should return a ubyte* as well.
> 
> There's very little chance that such a change will occur. To make it useful,
> char (or preferably 'utf8') should implicitly cast to ubyte, or using e.g.
> string literals would be a pain.

Only when trying to call C functions.  But even then, we wouldn't need to go as
far as that.  Just add ubyte* to the list of types that a string literal can
serve as.
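
To make that concrete, here is a minimal sketch of what calling such a function
looks like with an explicit cast, which is the step the proposal would remove.
countBytes is a hypothetical function invented for illustration; it is defined
in D with C linkage only so that the sketch links on its own.

```d
// Hypothetical C-style function taking unsigned char data (illustration only).
extern (C) size_t countBytes(const(ubyte)* s)
{
    size_t n = 0;
    while (s[n] != 0)   // stop at the terminating zero byte
        ++n;
    return n;
}

size_t demo()
{
    // String literals are zero-terminated, so passing .ptr is safe here.
    // The explicit cast is what the proposal would make unnecessary.
    return countBytes(cast(const(ubyte)*) "hello".ptr);
}
```

If string literals could serve as ubyte* directly, the call would shrink to
countBytes("hello"), just as it does today for functions taking char*.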

> Many programs and libraries, in particular
> Phobos and Tango, would also have to make a lot of changes just to compile.
> Plus, it'd be another inconsistency between C and D: C 'char' would map to D
> 'ubyte'.

Which, as I read comment 12, is what you were already proposing.

But is it really an inconsistency?  Really, all that's happened is that C's
signed char has been renamed as byte, and C's unsigned char as ubyte.  It's no
more inconsistent than unsigned int being renamed uint, and long long being
renamed long.
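
The correspondence can be checked at compile time. This is only a sketch: the
C sizes in the comments assume a typical platform with 8-bit bytes, 32-bit
unsigned int and 64-bit long long, which C itself does not guarantee.

```d
// Size and range checks for the D names of C's integer types.
static assert(byte.sizeof == 1 && byte.min == -128 && byte.max == 127);  // C signed char
static assert(ubyte.sizeof == 1 && ubyte.min == 0 && ubyte.max == 255);  // C unsigned char
static assert(uint.sizeof == 4 && uint.min == 0);  // C unsigned int (typical platforms)
static assert(long.sizeof == 8);                   // C long long (typical platforms)

// D's char is a distinct type meant for UTF-8 code units, not a small integer.
static assert(char.sizeof == 1 && !is(char == ubyte) && !is(char == byte));
```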

The names 'byte' and 'ubyte' better reflect how C's char types tend to be used:
- as a code unit in an arbitrary 8-bit character encoding
- to hold a byte-sized integer value of arbitrary semantics (though APIs that
do this often define an alias of char to make this clearer)
That is more or less how D programmers already use byte/ubyte, and how ISTM you
think they should be used.
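
As an illustration of the first use, here is a small sketch that keeps text in
a non-UTF-8 encoding as ubyte[] and converts it to a D string only on request.
Latin-1 is my choice of example encoding (each byte maps directly to the code
point of the same value), and fromLatin1 is a hypothetical helper name.

```d
// Decode Latin-1 bytes into a UTF-8 string (hypothetical helper).
// Each Latin-1 byte is the code point U+0000..U+00FF of the same value.
string fromLatin1(const(ubyte)[] bytes)
{
    string result;
    foreach (b; bytes)
        result ~= cast(dchar) b;  // appending a dchar encodes it as UTF-8
    return result;
}
```

The raw data never has to pretend to be char[], so nothing that expects valid
UTF-8 sees the Latin-1 bytes by accident.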

--