Top 5

Sergey Gromov snake.scaly at gmail.com
Sat Oct 11 12:25:04 PDT 2008


Sat, 11 Oct 2008 14:46:55 -0400,
Benji Smith wrote:
> And, btw, you *can't* scan bytewise through a D string to find space 
> characters, because the value '32' can occur as the 
> least-significant-byte in a multi-byte non-whitespace character. Any 
> code that iterates bytewise through a char[] array is fundamentally broken.

You're wrong.  char[] is not a legacy MBCS encoding, it's UTF-8.  In 
UTF-8, every byte that is part of a multi-byte sequence has its most 
significant bit set, so its value is always 0x80 or higher and can never 
be 32.  You can safely search a UTF-8 sequence for any ASCII character 
as if it were an array of bytes.
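
To make that concrete, here is a rough sketch in C, where a char array is a plain byte array much like D's char[].  The helper names find_space and in_multibyte_sequence are made up for illustration and are not part of any library; the input is assumed to be valid, NUL-terminated UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if the byte belongs to a multi-byte
   UTF-8 sequence (lead or continuation byte), i.e. its high bit is set. */
int in_multibyte_sequence(unsigned char b)
{
    return (b & 0x80) != 0;
}

/* Hypothetical helper: byte index of the first ASCII space (0x20) in a
   NUL-terminated UTF-8 string, or -1 if there is none.  It scans byte by
   byte and never decodes code points; that is safe because 0x20 has its
   high bit clear and so cannot occur inside a multi-byte sequence. */
long find_space(const char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i) {
        if ((unsigned char)s[i] == 0x20)
            return (long)i;
    }
    return -1;
}
```

For example, U+0120 has 0x20 as the low byte of its code point, but its UTF-8 encoding is the two bytes C4 A0, so find_space("\xC4\xA0") finds nothing, while find_space("na\xC3\xAFve caf\xC3\xA9") stops at the real space at byte index 6.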
