D's confusing strings (was Re: D on hackernews)

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Wed Sep 21 09:39:03 PDT 2011


On 9/21/11 10:16 AM, Christophe wrote:
> Timon Gehr, in message (digitalmars.D:144889), wrote:
>> unicode natively. Yet the 'D strings are strange and confusing' argument
>> comes up quite often on the web.
>
> Well, I think they are. The ptr+length stuff is amazing, but the
> behavior of strings in phobos is weird.
>
> mini-quiz: what should std.range.drop(some_string, 1) do ?
> hint: what it actually does is not what the documentation of phobos
> suggests*...
>
> Strings are arrays of char, but they appear as a lazy range of dchar to
> phobos. I could cope with the fact that this is a little unexpected for
> beginners. But that creates a lot of exceptions in phobos, like
> the fact that you can't even copy a char[] to a char[] with
> std.algorithm.copy. And I'm not even mentioning all the optimizations
> that are not or cannot be performed for those strings. I'll just
> remember to use ubyte[] wherever I can...
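
For readers following along, here is a minimal sketch of the array-versus-range
mismatch described above. It assumes a Phobos in which narrow strings are
auto-decoded; the helper names codeUnitCount, codePointCount and dropFirst are
made up for illustration and are not part of Phobos.

```d
import std.range : drop, ElementType, walkLength;

// Phobos treats string as a range of dchar, not of char.
static assert(is(ElementType!string == dchar));

// Hypothetical helper: number of UTF-8 code units, which is what .length reports.
size_t codeUnitCount(string s) { return s.length; }

// Hypothetical helper: number of code points, which is what range algorithms see.
size_t codePointCount(string s) { return s.walkLength; }

// Hypothetical helper: drops one code point, which may span several code units.
string dropFirst(string s) { return s.drop(1); }

void main()
{
    string s = "\u00E9crit";           // U+00E9 takes two UTF-8 code units
    assert(codeUnitCount(s) == 6);      // array view: 2 + 4 code units
    assert(codePointCount(s) == 5);     // range view: 5 code points
    assert(dropFirst(s) == "crit");     // drop(1) removed two code units
    assert(s[1 .. $] != "crit");        // slicing by one unit splits U+00E9
}
```

The same value thus answers "how long are you?" and "what is left after
dropping one element?" differently depending on whether it is used as an array
or as a range.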

String handling in D is good modulo the oddities you noticed. What would 
make it perfect would be:

* Add a property .rep that returns ubyte[], ushort[], or uint[] for char[], 
wchar[], dchar[] respectively (with the appropriate qualifier).

* Replace .length with .codeUnits.

* Disallow [n] and [m .. n]

This would upgrade D's strings from good to awesome. Really it would be 
a dream come true. Unfortunately it would also break most D code out 
there. I don't see how we can improve the current situation while 
staying backward compatible.
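
A rough sketch of what the first two points could look like as free functions,
without changing the built-in string types. The names rep and codeUnits are
hypothetical and only mirror the list above; std.string.representation is an
existing Phobos function that returns the raw code-unit view of a string.

```d
import std.range : walkLength;
import std.string : representation;

// Hypothetical stand-in for the proposed .codeUnits property.
size_t codeUnits(const(char)[] s) { return s.length; }

// Hypothetical stand-in for the proposed .rep property.
immutable(ubyte)[] rep(string s) { return s.representation; }

void main()
{
    string s = "\u00E9t\u00E9";        // 3 code points, 5 UTF-8 code units
    assert(s.codeUnits == 5);           // explicit about what is being counted
    assert(s.walkLength == 3);          // code points, via the range interface

    auto bytes = s.rep;
    static assert(is(typeof(bytes) == immutable(ubyte)[]));
    // Indexing and slicing are unambiguous on the raw view.
    assert(bytes[0] == 0xC3 && bytes[1] == 0xA9);
}
```

This only illustrates the naming and the raw view. The third point, disallowing
[n] and [m .. n] on the string itself, cannot be emulated from library code and
is the part that would break existing programs.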


Andrei

