string is rarely useful as a function argument
foobar
foo at bar.com
Wed Dec 28 14:12:18 PST 2011
On Wednesday, 28 December 2011 at 21:17:49 UTC, Timon Gehr wrote:
>
> I was educated enough not to make that mistake, because I read
> the entire language specification before deciding the language
> was awesome and downloading the compiler. I find it strange
> that the product should be made less usable because we do not
> expect users to read the manual. But it is of course a valid
> point.
>
It's awfully optimistic to expect people to read the manual.
> There is nothing wrong with operating at the code unit level.
> Efficient slicing is very desirable.
>
I agree that it's useful. It is, however, the incorrect abstraction
level when you need a "string", which is by far the most common case
in user code. For example, if I need a name variable in a class:

    codeUnit[] name; // bug!
    string Name; // correct

I expect that most uses of code-unit arrays should be in the
standard library anyway since it provides the string manipulation
routines. It all boils down to making the common case trivial and
the rare case possible. You can use the underlying data structure
(code units) if you need it but the default "string" is what
people expect when thinking about what such a type does (a string
of letters). D's already 80% there since Phobos already treats
strings as bi-directional ranges of code-points which is much
closer to the mental image of a string of letters, so I think
this is about bringing the current design to its final conclusion.
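
To make the two views concrete, here is a minimal sketch. It assumes a
Phobos where narrow strings are decoded to code points during range
iteration; the helper names codeUnitCount and codePointCount are made
up for illustration.

```d
import std.range : walkLength;

// Array view: .length counts UTF-8 code units.
size_t codeUnitCount(string s) { return s.length; }

// Range view: walkLength iterates the string as a range of code points.
size_t codePointCount(string s) { return s.walkLength; }

unittest
{
    string name = "G\u00F6del";        // U+00F6 needs two code units in UTF-8
    assert(codeUnitCount(name) == 6);  // what the array sees
    assert(codePointCount(name) == 5); // what a reader of the name sees
    assert(name[1] == 0xC3);           // indexing yields a code unit, not a letter
}
```

The same variable gives two different answers depending on whether it
is used as an array or as a range, which is the mismatch I mean.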
>
> Exactly. It is acting less and less like an array of code
> units. But it *is* an array of code units. If the general
> consensus is that we need a string data type that acts at a
> different abstraction level by default (with which I'd
> disagree, but apparently I don't have a popular opinion here),
> then we need a string type in the standard library to do that.
> Changing the language so that an array of code units stops
> behaving like an array of code units is not a solution.
>
I agree that we should not break T[] for any T and should instead
introduce a library type. While I personally believe that such a
change will expose hidden bugs (certainly when unaware
programmers treat string as ASCII and the product is later
localized), it's a big disturbance to people's code, and it's
worth considering whether the benefit is worth the cost. Perhaps
some middle ground could be found such that existing code can
rely on the existing behavior and the new library type is
opt-in.
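
As a rough sketch of what such an opt-in type might look like (the
names Text, codeUnits and countCodePoints are hypothetical, and this is
not a proposal for the actual Phobos API): the wrapper keeps the UTF-8
array as its storage, iterates by code point, and exposes the raw array
only through an explicit property. It relies on decode and stride from
std.utf.

```d
import std.utf : decode, stride;

// Hypothetical opt-in string type; existing char[] code is untouched.
struct Text
{
    private string data;   // still stored as UTF-8 code units

    this(string s) { data = s; }

    // Input-range primitives that step by code point, not by code unit.
    @property bool empty() { return data.length == 0; }

    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i);   // decode the code point starting at index 0
    }

    void popFront()
    {
        data = data[stride(data, 0) .. $];   // skip the whole encoded sequence
    }

    // Explicit escape hatch for code that wants efficient code-unit slicing.
    @property string codeUnits() { return data; }
}

// Counts code points; works on a copy because iterating consumes the range.
size_t countCodePoints(Text t)
{
    size_t n = 0;
    for (; !t.empty; t.popFront())
        ++n;
    return n;
}

unittest
{
    auto t = Text("G\u00F6del");
    assert(t.codeUnits.length == 6);   // the array view is still reachable
    assert(countCodePoints(t) == 5);   // the default view is code points
}
```

Code that keeps using string or char[] directly behaves exactly as it
does today; only code that asks for the wrapper gets the higher-level
behavior.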