[Issue 1357] Cannot use FFFF and FFFE in Unicode escape sequences.

Wed Oct 3 06:57:56 PDT 2007

http://d.puremagic.com/issues/show_bug.cgi?id=1357

------- Comment #13 from smjg at iname.com  2007-10-03 08:57 -------
(In reply to comment #12)
> You're basically right, that's just my attitude towards types: if 
> it can be outside the [type.min,type.max] range it shouldn't be 
> stored in type.  It's like storing 119 in a bool just because it's 
> a byte and not a bit of data.  You can do it, but you shouldn't.  
> If there's a possibility that the data is malformed, you should 
> store it in a meaning-agnostic type like ubyte/uint.

True, up to a point.  But out-of-range data could just as easily be caused by
a bug in the program, so it makes little sense to use a meaning-agnostic type
just to steer clear of this possibility.  Half the point of the UTF validation
functions is to check for bugs.
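
As a rough illustration of that point, here is a minimal sketch, assuming a
present-day Phobos in which std.utf provides validate, isValidDchar and
UTFException.  The helper name isWellFormed is made up for this example and is
not part of any library.  It shows a dchar and a char[] holding data that the
type permits physically but that the validation functions reject:

```d
import std.utf : isValidDchar, validate, UTFException;

/// Hypothetical helper (not in Phobos): true if s is well-formed UTF-8.
bool isWellFormed(const(char)[] s)
{
    try
    {
        validate(s);   // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    // A dchar is 32 bits wide, so a cast can put a value beyond
    // dchar.max (0x10FFFF) into it, e.g. as the result of a bug.
    dchar outOfRange = cast(dchar) 0x110000;
    assert(!isValidDchar(outOfRange));
    assert(isValidDchar('\u00E9'));

    // Likewise a char[] can hold a lone continuation byte.
    char[] broken = [cast(char) 0x80];
    assert(!isWellFormed(broken));
    assert(isWellFormed("h\u00E9llo"));
}
```

Whether the data went wrong because of bad input or because of a bug, the
check is the same, and the value can still live in a character type until the
check is made.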

> Much of the problem is D's character types, which really should be 
> called something like "utf8", "utf16", and "utf32".  It annoys me 
> to no end that the C standard library purportedly understands 
> something about UTF-8: the C string type should be ubyte*, not 
> char*.  But that's just me.

If we're going to change this, toStringz should return a ubyte* as well.

--