Proposal for fixing dchar ranges

John Colvin john.loughran.colvin at gmail.com
Mon Mar 10 13:54:34 PDT 2014


On Monday, 10 March 2014 at 20:00:07 UTC, Steven Schveighoffer 
wrote:
> On Mon, 10 Mar 2014 15:30:00 -0400, John Colvin 
> <john.loughran.colvin at gmail.com> wrote:
>
>> On Monday, 10 March 2014 at 18:09:51 UTC, Steven Schveighoffer 
>> wrote:
>>>
>>> Because one can slice out a multi-code-unit code point, one 
>>> cannot access it via index. Strings would be horribly 
>>> crippled without slicing. Without indexing, they are fine.
>>>
>>> A possibility is to allow index, but actually decode the code 
>>> point at that index (error on invalid index). That might 
>>> actually be the correct mechanism.
>>>
>>
>> In order to be correct, both require exactly the same 
>> knowledge: The beginning of a code point, followed by the end 
>> of a code point. In the indexing case they just happen to be 
>> the same code-point and happen to be one code unit from each 
>> other. I don't see how one is any more or less error-prone or 
>> fundamentally wrong than the other.
>
> Using indexing, you simply cannot get the single code unit that 
> represents a multi-code-unit code point. It doesn't fit in a 
> char. It's guaranteed to fail, whereas slicing will give you 
> access to all the data in the string.
>

I think I understand your motivation now. Indexing never provides 
anything that slicing doesn't do more generally.
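
A minimal sketch of that point, assuming a UTF-8 string and using 
std.utf.stride to find the length of the code point at an index. 
The helper name codePointSlice is made up for illustration and is 
not part of Phobos:

```d
import std.utf : stride;

// Hypothetical helper: slice out the whole code point that starts at
// code unit i. Assumes i is the first code unit of a code point.
string codePointSlice(string s, size_t i)
{
    return s[i .. i + stride(s, i)];
}

unittest
{
    string s = "h\u00E9llo";              // U+00E9 takes two UTF-8 code units
    assert(s.length == 6);                // length counts code units
    assert(s[1] == 0xC3);                 // indexing yields only the lead byte
    assert(codePointSlice(s, 1) == "\u00E9"); // slicing yields the whole code point
    assert(codePointSlice(s, 0) == "h");  // single-unit case: same as s[0 .. 1]
}
```

Indexing is just the special case where the slice is one code unit 
long, which is why it cannot represent the two-unit code point above.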

> Now, with indexing actually decoding a code point, one can 
> alias a[i] to a[i..$].front(), which means decode the first 
> code point you come to at index i. This means indexing is 
> slow(er), and returns a dchar. I think as a first step, that 
> might be too much to add silently. I'd rather break it first, 
> then add it back later.
>
> -Steve

Of course, i would then have to be at the beginning of a code 
point. That doesn't seem like a very useful feature, and it is 
potentially very confusing for people who naively expect normal 
indexing.
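
A rough sketch of what decoding at an index might look like, written 
with std.utf.decode rather than any actual change to string indexing. 
The name codePointAt is hypothetical. As far as I know, 
a[i..$].front does the same decoding for narrow strings:

```d
import std.utf : decode, UTFException;
import std.exception : assertThrown;

// Hypothetical helper: decode the code point that starts at code unit i.
// i is passed by value, so decode only advances the local copy.
dchar codePointAt(string s, size_t i)
{
    return decode(s, i);
}

unittest
{
    string s = "h\u00E9llo";                 // U+00E9 occupies code units 1 and 2
    assert(codePointAt(s, 0) == 'h');
    assert(codePointAt(s, 1) == '\u00E9');   // returns a dchar, not a char
    // Index 2 is a continuation byte, so it is not a valid starting point.
    assertThrown!UTFException(codePointAt(s, 2));
    assert(codePointAt(s, 3) == 'l');
}
```

Note that the valid indices here are 0, 1, 3, 4 and 5, which is the 
part I'd expect to surprise anyone used to normal indexing.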

