string is rarely useful as a function argument
foobar
foo at bar.com
Wed Dec 28 22:45:46 PST 2011
On Wednesday, 28 December 2011 at 22:39:15 UTC, Timon Gehr wrote:
> On 12/28/2011 11:12 PM, foobar wrote:
>> On Wednesday, 28 December 2011 at 21:17:49 UTC, Timon Gehr wrote:
>>>
>>> I was educated enough not to make that mistake, because I read the
>>> entire language specification before deciding the language was
>>> awesome and downloading the compiler. I find it strange that the
>>> product should be made less usable because we do not expect users
>>> to read the manual. But it is of course a valid point.
>>>
>>
>> That's awfully optimistic to expect people to read the manual.
>>
>
> Well, if the alternative is slowly butchering the language I
> will be awfully optimistic about it all day long.
>
>>> There is nothing wrong with operating at the code unit level.
>>> Efficient slicing is very desirable.
>>>
>>
>> I agree that it's useful. It is however the incorrect abstraction
>> level when you need a "string" which is by far the common case in
>> user code.
>
> I would not go as far as to call it 'incorrect'.
>
>> i.e. if I need a name variable in a class:
>> codeUnit[] name; // bug!
>> string Name; // correct
>>
>
> From a pragmatic viewpoint it does not matter because if string
> is used like this, then codeUnit[] does exactly the same thing.
> Nobody forces anyone to index or slice into a string variable
> when they don't need that functionality. All engineers have to
> work with leaky abstractions. Why is it such a big deal?
>
>
>> I expect that most uses of code-unit arrays should be in the
>> standard library anyway since it provides the string manipulation
>> routines. It all boils down to making the common case trivial and
>> the rare case possible. You can use the underlying data structure
>> (code units) if you need it but the default "string" is what people
>> expect when thinking about what such a type does (a string of
>> letters). D's already 80% there since Phobos already treats strings
>> as bi-directional ranges of code-points which is much closer to the
>> mental image of a string of letters, so I think this is about
>> bringing the current design to its final conclusion.
>>
>
> Well, that mental image is just not the right one when dealing
> with Unicode.
>
>>>
>>> Exactly. It is acting less and less like an array of code units.
>>> But it *is* an array of code units. If the general consensus is
>>> that we need a string data type that acts at a different
>>> abstraction level by default (with which I'd disagree, but
>>> apparently I don't have a popular opinion here), then we need a
>>> string type in the standard library to do that. Changing the
>>> language so that an array of code units stops behaving like an
>>> array of code units is not a solution.
>>>
>>
>> I agree that we should not break T[] for any T and instead
>> introduce a library type. While I personally believe that such a
>> change will expose hidden bugs (certainly when unaware programmers
>> treat string as ASCII and the product is later on localized), it's
>> a big disturbance in people's code and it's worth a consideration
>> if the benefit worth the costs. Perhaps, some middle ground could be
>> found such that existing code can rely on existing behavior and the
>> new library type will be an opt-in.
>
> What will such a type offer, except that it disallows indexing
> and slicing?

From a pragmatic viewpoint, people can also continue programming
in C++ instead of investing a lot of effort in learning a new
language.

The only difference between programming languages is the
human-interface aspect. Anything you can program in D you could
also do in assembly, yet you prefer D because it's more
convenient. In that regard, a code-unit array is definitely worse
than a string type.

A programmer can choose either to change his 'naive' mental image
or to change the programming language. Most will do the latter.
Computers need to adapt and be human-friendly, not vice versa.
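
To make the gap between the two views concrete, here is a minimal
sketch. The helper names codeUnitCount and codePointCount are made
up for illustration; the only real calls are the built-in .length
property and std.range.walkLength, and I am assuming the current
Phobos behaviour of decoding a string when it is walked as a range.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper: the array view. .length counts UTF-8 code units.
size_t codeUnitCount(string s)
{
    return s.length;
}

// Hypothetical helper: the range view. Walking the string as a range
// decodes it, so this counts code points rather than code units.
size_t codePointCount(string s)
{
    return s.walkLength;
}

void main()
{
    // U+00EB (e with diaeresis) occupies two UTF-8 code units.
    string name = "Zo\u00EB";
    writeln(codeUnitCount(name));  // 4
    writeln(codePointCount(name)); // 3
}
```

The same variable gives two different answers depending on which
abstraction level you ask at, and the one the language hands you by
default is not the one most user code has in mind.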