string is rarely useful as a function argument

Timon Gehr timon.gehr at gmx.ch
Fri Dec 30 14:27:11 PST 2011


On 12/30/2011 10:36 PM, deadalnix wrote:
> On 30/12/2011 20:55, Timon Gehr wrote:
>> On 12/30/2011 08:33 PM, Joshua Reusch wrote:
>>> On 29.12.2011 19:36, Andrei Alexandrescu wrote:
>>>> On 12/29/11 12:28 PM, Don wrote:
>>>>> On 28.12.2011 20:00, Andrei Alexandrescu wrote:
>>>>>> Oh, one more thing - one good thing that could come out of this thread
>>>>>> is abolition (through however slow a deprecation path) of s.length and
>>>>>> s[i] for narrow strings. Requiring s.rep.length instead of s.length and
>>>>>> s.rep[i] instead of s[i] would improve the quality of narrow strings
>>>>>> tremendously. Also, s.rep[i] should return ubyte/ushort, not char/wchar.
>>>>>> Then, people would access the decoding routines on the needed occasions,
>>>>>> or would consciously use the representation.
>>>>>>
>>>>>> Yum.
>>>>>
>>>>>
>>>>> If I understand this correctly, most others don't. Effectively, .rep
>>>>> just means, "I know what I'm doing", and there's no change to existing
>>>>> semantics, purely a syntax change.
>>>>
>>>> Exactly!
>>>>
>>>>> If you change s[i] into s.rep[i], it does the same thing as now. There's
>>>>> no loss of functionality -- it just stops you from accidentally doing
>>>>> the wrong thing. Like .ptr for getting the address of an array.
>>>>> Typically all the ".rep" everywhere would get annoying, so you would
>>>>> write:
>>>>> ubyte [] u = s.rep;
>>>>> and use u from then on.
>>>>>
>>>>> I don't like the name 'rep'. Maybe 'raw' or 'utf'?
>>>>> Apart from that, I think this would be perfect.
>>>>
>>>> Yes, I mean "rep" as a short for "representation" but upon first sight
>>>> the connection is tenuous. "raw" sounds great.
>>>>
>>>> Now I'm twice sorry this will not happen...
>>>>
>>>
>>> Maybe it could happen if we
>>> 1. make dstring the default strings type --
>>
>> Inefficient.
>>
>>> code units and characters would be the same
>>
>> Wrong.
>>
>>> or 2. forward string.length to std.utf.count and opIndex to
>>> std.utf.toUTFindex
>>
>> Inconsistent and inefficient (it blows up the algorithmic complexity).
>>
>>>
>>> so programmers could use the slices/indexing/length (no laziness
>>> problems), and if they really want code units use .raw/.rep (or better
>>> .utf8/16/32 with std.string.representation(std.utf.toUTF8/16/32))
>>>
>>
>> Anyone who intends to write efficient string processing code needs this.
>> Anyone who does not want to write string processing code will not need
>> to index into a string -- standard library functions will suffice.
>>
>>> But generally I liked the idea of just having an alias for strings...
>>
>> Me too. I think the way we have it now is optimal. The only reason we
>> are discussing this is because of fear that uneducated users will write
>> code that does not take into account Unicode characters at or above code
>> point 0x80. But what is the worst thing that can happen?
>>
>
> Atos Origin was hacked because of bad handling of Unicode strings in
> some of their software.

And cast(string)s.rep[i..j] would magically fix all those bugs?
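
To make that concrete, here is a minimal sketch. It assumes the proposed
.rep would behave like today's std.string.representation (.rep itself does
not exist, so that is my assumption), and the sample string is made up.
Going through the representation and casting back still lets a slice cut a
multi-byte sequence in half:

```d
import std.exception : assertThrown;
import std.string : representation;
import std.utf : count, validate, UTFException;

void main()
{
    // "h\u00E9llo": the 'é' occupies two UTF-8 code units.
    string s = "h\u00E9llo";

    assert(s.length == 6); // number of code units
    assert(count(s) == 5); // number of code points

    // Stand-in for the hypothetical s.rep.
    immutable(ubyte)[] raw = s.representation;
    assert(raw.length == s.length);

    // This slice ends between the two code units of 'é'.
    string broken = cast(string) raw[0 .. 2];

    // The result is not valid UTF-8; validate reports it at run time.
    assertThrown!UTFException(validate(broken));
}
```

The explicit representation access marks the spot where the slicing
happens, but as far as I can tell the wrong indices are just as wrong as
with s[0 .. 2].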

>
> Consequences can be more important than you may think.
>
> Additionally, you make an assumption that is really wrong: that an educated
> programmer will not make mistakes.

I am not. I am just assuming that the proposed change does not help with 
that.

> C programmers will tell you
> exactly the same thing if the discussion comes to pointers. But the fact
> is, we all make mistakes. Many of them! We should go into unsafe
> behaviour, which relies on the programmer's capabilities, only when needed.
>
> I do understand pointers. I do make mistakes with them, and it sometimes
> has crazy consequences. And I do not trust anyone who tells me
> he/she doesn't.
>
> The #1 quality of a programmer is to act like he/she is a moron.
> Because sometimes we all are morons.

The #1 quality of a programmer is to write correct code. If he/she acts 
as if he/she is a moron, he/she will write code that acts like a moron. 
Simple as that.

