string is rarely useful as a function argument

foobar foo at bar.com
Wed Dec 28 14:12:18 PST 2011


On Wednesday, 28 December 2011 at 21:17:49 UTC, Timon Gehr wrote:
>
> I was educated enough not to make that mistake, because I read 
> the entire language specification before deciding the language 
> was awesome and downloading the compiler. I find it strange 
> that the product should be made less usable because we do not 
> expect users to read the manual. But it is of course a valid 
> point.
>

It's awfully optimistic to expect people to read the manual.

> There is nothing wrong with operating at the code unit level. 
> Efficient slicing is very desirable.
>

I agree that it's useful. However, it is the wrong abstraction 
level when you need a "string", which is by far the common case in 
user code. E.g., if I need a name variable in a class:

codeUnit[] name; // bug!
string name; // correct

I expect that most uses of code-unit arrays should be in the 
standard library anyway since it provides the string manipulation 
routines. It all boils down to making the common case trivial and 
the rare case possible. You can use the underlying data structure 
(code units) if you need it but the default "string" is what 
people expect when thinking about what such a type does (a string 
of letters). D's already 80% there since Phobos already treats 
strings as bi-directional ranges of code-points which is much 
closer to the mental image of a string of letters, so I think 
this is about bringing the current design to its final conclusion.
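
To illustrate the gap between the two views, here is a minimal 
sketch, assuming a current Phobos where strings are still decoded 
to code points when used as ranges. The sample text is my own; the 
calls (walkLength, front, validate, assertThrown) are existing 
Phobos functions.

```d
import std.exception : assertThrown;
import std.range : front, walkLength;
import std.utf : UTFException, validate;

void main()
{
    // "héllo": the 'é' (U+00E9) takes two UTF-8 code units.
    string name = "h\u00E9llo";

    // Array view: .length counts code units.
    assert(name.length == 6);

    // Range view: Phobos decodes code points on the fly.
    assert(name.walkLength == 5);
    assert(name.front == 'h');

    // Slicing works on code-unit offsets, so it can cut a
    // character in half and leave malformed UTF-8 behind.
    auto broken = name[0 .. 2]; // 'h' plus the first half of 'é'
    assertThrown!UTFException(validate(broken));
}
```

The slice itself is cheap and legal, which is exactly why it is 
easy to get wrong in user code that assumes one unit per letter.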

>
> Exactly. It is acting less and less like an array of code 
> units. But it *is* an array of code units. If the general 
> consensus is that we need a string data type that acts at a 
> different abstraction level by default (with which I'd 
> disagree, but apparently I don't have a popular opinion here), 
> then we need a string type in the standard library to do that. 
> Changing the language so that an array of code units stops 
> behaving like an array of code units is not a solution.
>

I agree that we should not break T[] for any T and should instead 
introduce a library type. While I personally believe that such a 
change would expose hidden bugs (certainly when unaware 
programmers treat string as ASCII and the product is later 
localized), it would be a big disturbance to people's code, and 
it's worth considering whether the benefit is worth the cost. 
Perhaps some middle ground could be found, such that existing 
code can rely on the existing behavior and the new library type 
is opt-in.
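
As a rough sketch of what such an opt-in type could look like: the 
name Text and its members below are hypothetical and not part of 
Phobos; only validate, walkLength, byDchar and equal are existing 
library functions. Plain string and T[] are left untouched.

```d
import std.algorithm.comparison : equal;
import std.range : walkLength;
import std.utf : byDchar, validate;

/// Hypothetical opt-in string type that works at the code-point
/// level by default. Not an existing Phobos type.
struct Text
{
    private string data;

    this(string s)
    {
        validate(s); // reject malformed UTF-8 up front
        data = s;
    }

    /// Number of code points; O(n) because it has to decode.
    size_t length() { return data.walkLength; }

    /// Iterate code point by code point.
    auto byCodePoint() { return data.byDchar; }

    /// Escape hatch for code that really wants the code units.
    string codeUnits() { return data; }
}

void main()
{
    auto name = Text("na\u00EFve"); // 'ï' is two code units
    assert(name.length == 5);
    assert(name.codeUnits.length == 6);
    assert(name.byCodePoint.equal("na\u00EFve"d));
}
```

Existing code keeps using string as an array of code units; only 
code that asks for Text gets the different default.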