Top 5
Benji Smith
dlanguage at benjismith.net
Sat Oct 11 12:26:19 PDT 2008
Sergey Gromov wrote:
> Sat, 11 Oct 2008 14:46:55 -0400,
> Benji Smith wrote:
>> And, btw, you *can't* scan bytewise through a D string to find space
>> characters, because the value '32' can occur as the
>> least-significant-byte in a multi-byte non-whitespace character. Any
>> code that iterates bytewise through a char[] array is fundamentally broken.
>
> You're wrong. char[] is not MBCS, it's UTF-8. In UTF-8 any byte which
> is part of a multi-byte sequence always has its most significant bit
> set. You can safely search for any ASCII in UTF-8 sequence as if it
> were an array of bytes.
Oh yeah. I totally forgot about that. Good point.
--benji
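
For anyone reading this thread later, here is a minimal sketch of the property Sergey describes, written in Python rather than D so that it is easy to run. The function name `find_ascii_bytewise` and the sample strings are made up for illustration and come from neither poster. It scans UTF-8 bytes one at a time for an ASCII character, then shows that the original worry does hold for an encoding such as UTF-16.

```python
def find_ascii_bytewise(data: bytes, ch: str) -> int:
    """Return the byte offset of the first occurrence of the ASCII
    character `ch` in UTF-8 encoded `data`, or -1 if it is absent."""
    target = ord(ch)
    if target >= 0x80:
        # Non-ASCII characters span several bytes, so a single-byte
        # comparison cannot find them.
        raise ValueError("only ASCII characters can be searched bytewise")
    for i, b in enumerate(data):
        if b == target:
            return i
    return -1


# Hypothetical sample text mixing ASCII with 2-byte and 3-byte characters.
text = "naïve 日本語 text"
encoded = text.encode("utf-8")

# Every byte of a multi-byte UTF-8 sequence has its high bit set,
# so none of them can be mistaken for the space character (0x20).
multibyte = [b for c in text if ord(c) > 0x7F for b in c.encode("utf-8")]
assert all(b >= 0x80 for b in multibyte)

offset = find_ascii_bytewise(encoded, " ")
# The bytes before the match decode cleanly, so the match fell on a
# character boundary and not inside a multi-byte sequence.
prefix = encoded[:offset].decode("utf-8")

# The same is not true of UTF-16: U+2020 (dagger) encodes to the bytes
# 0x20 0x20 in little-endian order, although it is not whitespace.
dagger_utf16 = "\u2020".encode("utf-16-le")
dagger_utf8 = "\u2020".encode("utf-8")
```

Note that the returned offset counts bytes, not characters, so it can be larger than the character index whenever non-ASCII text precedes the match.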