Top 5

Oskar Linde oskar.lindeREM at OVEgmail.com
Sat Oct 11 08:19:50 PDT 2008


Benji Smith wrote:

> The current state of affairs, where strings are transparently just 
> arrays of UTF-8 bytes makes them impossible to work with. They're 
> unindexable, unsliceable. You can't operate directly on those arrays. 
> You're forced to use the phobos/tango functions (which, by the way, are 
> incompatible with one another).

I disagreed with you the last time you said this too. As far as I see 
it, there is nothing to gain from making strings objects. I've done 
quite a lot of string processing in D, including non-Latin text, and never 
found a problem slicing or indexing char[]s.
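
To illustrate the kind of slicing meant here, a minimal sketch, assuming a 
current D2 compiler and Phobos (std.string.indexOf, std.utf.validate). The 
helper name "before" is made up for this example. Because UTF-8 is 
self-synchronizing, a substring search only matches at code point 
boundaries, so slicing at the returned code unit index yields valid UTF-8:

```d
import std.stdio : writeln;
import std.string : indexOf;
import std.utf : validate;

// Hypothetical helper: returns the part of `s` before the first `sep`,
// or all of `s` if `sep` does not occur.
string before(string s, string sep)
{
    auto i = s.indexOf(sep);        // index in code units, -1 if absent
    return i < 0 ? s : s[0 .. i];   // plain array slice, no decoding
}

void main()
{
    string s = "smörgåsbord, köttbullar";
    string head = before(s, ", ");
    validate(head);                 // would throw if the slice were malformed UTF-8
    writeln(head);                  // prints: smörgåsbord
}
```

The slice never cuts through a multibyte sequence, because the index comes 
from a search rather than from arbitrary arithmetic on the array.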

> If D strings must be character arrays, I'd love for them to at least be 
> ordinary arrays. Each element of a char[] array should be a single 
> character. And if a sizeof(char) == 1, then a char should be limited to 
> a single byte. To represent multibyte characters, it should be necessary 
> to use a wchar[] or dchar[] array.

And you would be back to the horrors of the pre-Unicode world with 
incompatible code pages and encodings. Not even a dchar can always 
represent a single character as the user perceives it, since such a 
character can be composed of a combining character sequence, that is, 
several code points.
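
A small sketch of that point, assuming a current D2 compiler and Phobos 
(std.utf.byDchar, std.uni.byGrapheme, std.range.walkLength). The same text 
has three different lengths depending on what is counted:

```d
import std.stdio : writeln;
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT, displayed as one "é".
    string s = "e\u0301";

    writeln(s.length);                 // 3: UTF-8 code units (char)
    writeln(s.byDchar.walkLength);     // 2: code points (dchar)
    writeln(s.byGrapheme.walkLength);  // 1: user-perceived character
}
```

So a dchar[] of this text would still hold two elements for what a reader 
sees as one character.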

There is nothing wrong with D's way of handling strings. You just need 
to lose the preconception of needing atomic characters.

-- 
Oskar


