Ascii matters
Jonathan M Davis
jmdavisProg at gmx.com
Wed Aug 22 20:03:48 PDT 2012
On Wednesday, August 22, 2012 19:52:10 Sean Kelly wrote:
> I'm clearly missing something. ASCII and UTF-8 are compatible. What's
> stopping you from just processing these as if they were UTF-8 strings?
Range-based functions will treat arrays of char or wchar as forward ranges of
dchar. Because each code point is encoded as a variable number of code units,
such arrays aren't considered to have length, be random access, or have slicing,
and they will not generally work with range-based functions which require any of
those operations (though some range-based functions do specialize on strings and
use those operations where they can, based on a proper understanding of Unicode).
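As a rough illustration of the above, the range traits in std.range can be
queried at compile time. This is only a sketch, assuming a Phobos version in
which narrow strings are decoded to dchar by the range primitives; the sample
string is made up for the example.

```d
import std.range; // ElementType, hasLength, hasSlicing, isForwardRange, isRandomAccessRange, walkLength

void main()
{
    // The range primitives decode UTF-8, so the element type is dchar, not char.
    static assert(is(ElementType!string == dchar));
    static assert(isForwardRange!string);

    // The range traits report no length, random access, or slicing for narrow strings.
    static assert(!hasLength!string);
    static assert(!isRandomAccessRange!string);
    static assert(!hasSlicing!string);

    // The array itself still has .length, but it counts code units.
    string s = "héllo";          // 'é' takes two UTF-8 code units
    assert(s.length == 6);       // code units
    assert(walkLength(s) == 5);  // code points, found by decoding
}
```

The array's own .length and indexing still compile; it is the range traits
that say no, and the range traits are what generic code checks.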
On the other hand, if you have a string that specifically holds ASCII and you
know that it only holds ASCII, you know that you can safely use length, random
access, and slicing as if each code unit were a full code point. But the
range-based functions don't know that your string is guaranteed to be ASCII-
only, so they continue to treat it as a range of dchar rather than char. The
solution is to either create a wrapper range whose element type is char or to
cast the char[] to ubyte[]. And Bearophile wants such a wrapper range to be
added to Phobos.
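The cast approach might look something like the following. It is a minimal
sketch, and it assumes the caller already knows that the array holds nothing
but ASCII; the variable names and sample data are invented for the example.

```d
import std.algorithm; // sort
import std.range;     // ElementType, hasLength, hasSlicing, isRandomAccessRange

void main()
{
    char[] text = "delta".dup;   // assumed to hold only ASCII

    // Reinterpret the same memory as bytes; no copy is made.
    ubyte[] bytes = cast(ubyte[]) text;

    // As ubyte[], the array is an ordinary random-access range of code units.
    static assert(is(ElementType!(ubyte[]) == ubyte));
    static assert(hasLength!(ubyte[]));
    static assert(hasSlicing!(ubyte[]));
    static assert(isRandomAccessRange!(ubyte[]));

    // sort requires a random-access range, so it accepts bytes but not text.
    static assert(!__traits(compiles, sort(text)));
    sort(bytes);

    // The cast aliases the original array, so text is now sorted as well.
    assert(text == "adelt");
}
```

The cast is only safe because of the ASCII assumption. On a string holding
multi-byte sequences, sorting or slicing the bytes could split a code point
and leave invalid UTF-8 behind.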
- Jonathan M Davis