string is rarely useful as a function argument

Timon Gehr timon.gehr at gmx.ch
Sun Jan 1 05:19:58 PST 2012


On 01/01/2012 08:10 AM, Don wrote:
> On 31.12.2011 17:13, Timon Gehr wrote:
>> On 12/31/2011 01:15 PM, Don wrote:
>>> On 31.12.2011 01:56, Timon Gehr wrote:
>>>> On 12/31/2011 01:12 AM, Andrei Alexandrescu wrote:
>>>>> On 12/30/11 6:07 PM, Timon Gehr wrote:
>>>>>> alias std.string.representation raw;
>>>>>
>>>>> I meant your implementation is incomplete.
>>>>
>>>> It was more a sketch than an implementation. It is not even type safe
>>>> :o).
>>>>
>>>>>
>>>>> But the main point is that presence of representation/raw is not the
>>>>> issue. The availability of good-for-nothing .length and operator[] are
>>>>> the issue. Putting in place the convention of using .raw is hardly
>>>>> useful within the context.
>>>>>
>>>>
>>>> D strings are arrays. An array without .length and operator[] is close
>>>> to being good for nothing. The language specification is quite clear
>>>> about the fact that e.g. char is not a character but a UTF-8 code
>>>> unit. Therefore char[] is an array of code units.
>>>
>>> No, it isn't. That's the problem. char[] is not an array of char.
>>> It has an additional invariant: it is a UTF8 string. If you randomly
>>> change elements, the invariant is violated.
>>
>> char[] is an array of char and the additional invariant is not enforced
>> by the language.
>
> No, it isn't an ordinary array. For example with concatenation. char[] ~
> int will never create an invalid string.

Yes it will.

import std.stdio;

void main() {
     char[] x;
     writeln(x~255); // 255 is appended as the single code unit 0xFF
}

> You can end up with multiple chars being appended, even from a single
> append. foreach is different, too. They are a bit magical.

Fair enough, but type conversion rules are a bit magical in general.

void main() {
     auto a = cast(short[])[1,2,3]; // literal: each element is converted to short
     auto b = [1,2,3];              // int[]
     auto c = cast(short[])b;       // int[] reinterpreted: c has 6 elements
     assert(a != c);
}
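
For reference, the append and foreach behaviour mentioned in the quote above can be sketched roughly as follows. This assumes the appended value is a dchar outside the ASCII range; codeUnits and codePoints are hypothetical helper names made up for this illustration, not library functions.

```d
// Hypothetical helpers, for illustration only.
size_t codeUnits(const(char)[] s)
{
    size_t n;
    foreach (char c; s) ++n;   // iterates over UTF-8 code units
    return n;
}

size_t codePoints(const(char)[] s)
{
    size_t n;
    foreach (dchar c; s) ++n;  // decodes to code points on the fly
    return n;
}

void main()
{
    char[] s;
    dchar e = '\u00E9';        // U+00E9, two code units in UTF-8
    s ~= e;                    // one append, two chars added
    assert(s.length == 2);
    assert(codeUnits(s) == 2);
    assert(codePoints(s) == 1);
}
```

The element type chosen in the foreach decides whether the loop sees code units or decoded code points, which is the sense in which char[] behaves differently from other arrays here.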

> There's quite a lot of code in the compiler to make sure that strings
> remain valid.
>

At the same time, there are many language features that allow creating
invalid strings.

auto a = "\377\252\314";
auto b = x"FF AA CC";
auto c = import("binary");
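
Whether such a literal is well-formed can be checked at run time. A minimal sketch, assuming std.utf.validate, which throws a UTFException on malformed input; isValidUtf8 is a hypothetical helper introduced only for this example.

```d
import std.utf : validate, UTFException;

// Hypothetical helper: true if s is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    string a = "\377\252\314";   // octal escapes for the bytes FF AA CC
    assert(a.length == 3);       // three code units, accepted by the compiler
    assert(!isValidUtf8(a));     // but not well-formed UTF-8
    assert(isValidUtf8("abc"));
}
```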

> The additional invariant is not enforced in the case of slicing; that's
> the point.


