Proposal for fixing dchar ranges
John Colvin
john.loughran.colvin at gmail.com
Mon Mar 10 13:54:34 PDT 2014
On Monday, 10 March 2014 at 20:00:07 UTC, Steven Schveighoffer
wrote:
> On Mon, 10 Mar 2014 15:30:00 -0400, John Colvin
> <john.loughran.colvin at gmail.com> wrote:
>
>> On Monday, 10 March 2014 at 18:09:51 UTC, Steven Schveighoffer
>> wrote:
>>>
>>> Because one can slice out a multi-code-unit code point, one
>>> cannot access it via index. Strings would be horribly
>>> crippled without slicing. Without indexing, they are fine.
>>>
>>> A possibility is to allow index, but actually decode the code
>>> point at that index (error on invalid index). That might
>>> actually be the correct mechanism.
>>>
>>
>> In order to be correct, both require exactly the same
>> knowledge: The beginning of a code point, followed by the end
>> of a code point. In the indexing case they just happen to be
>> the same code point and happen to be one code unit from each
>> other. I don't see how one is any more or less error-prone or
>> fundamentally wrong than the other.
>
> Using indexing, you simply cannot get the single code unit that
> represents a multi-code-unit code point. It doesn't fit in a
> char. It's guaranteed to fail, whereas slicing will give you
> access to all the data in the string.
>
I think I understand your motivation now. Indexing never provides
anything that slicing doesn't do more generally.
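
To make that concrete, here is a minimal D sketch of indexing versus slicing on a string holding one two-code-unit code point. The example string and variable names are my own illustration, not anything from the proposal.

```d
import std.stdio : writefln;

void main()
{
    // U+00E9 is encoded in UTF-8 as the two code units 0xC3 0xA9,
    // so this string has three code units: 'a', 0xC3, 0xA9.
    string s = "a\u00E9";
    assert(s.length == 3);

    // Indexing yields a single code unit (a char), so it can only
    // ever return a fragment of the two-unit code point.
    char unit = s[1];
    assert(unit == 0xC3);

    // A slice can span both code units and keeps the code point whole.
    string whole = s[1 .. 3];
    assert(whole == "\u00E9");

    // A one-unit slice carries the same information as the index,
    // which is why indexing adds nothing that slicing lacks.
    assert(s[1 .. 2].length == 1 && s[1 .. 2][0] == s[1]);

    writefln("unit = 0x%02X, slice = %s", unit, whole);
}
```
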
> Now, with indexing actually decoding a code point, one can
> alias a[i] to a[i..$].front(), which means decode the first
> code point you come to at index i. This means indexing is
> slow(er), and returns a dchar. I think as a first step, that
> might be too much to add silently. I'd rather break it first,
> then add it back later.
>
> -Steve
Of course, that i has to be at the beginning of a code point.
That doesn't seem like a very useful feature, and it is potentially
very confusing for people who naively expect normal indexing.
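
As a rough sketch of what a decoding a[i] would amount to, the following uses std.utf.decode and the range primitive front from Phobos. The helper decodeAt is a hypothetical name I made up for illustration, and this is only my reading of the suggestion, not a proposed implementation.

```d
import std.exception : assertThrown;
import std.range.primitives : front;
import std.utf : decode, UTFException;

// Hypothetical helper: decode the code point that starts at code unit i.
dchar decodeAt(string s, size_t i)
{
    size_t next = i;        // decode advances this past the code point
    return decode(s, next);
}

void main()
{
    string s = "a\u00E9b";  // code units: 'a', 0xC3, 0xA9, 'b'

    assert(decodeAt(s, 0) == 'a');
    assert(decodeAt(s, 1) == '\u00E9');  // i is at the start of the code point
    assert(decodeAt(s, 3) == 'b');

    // The same result via the a[i..$].front spelling.
    assert(s[1 .. $].front == '\u00E9');

    // i = 2 lands on the continuation unit 0xA9, which cannot start
    // a code point, so decoding fails.
    assertThrown!UTFException(decodeAt(s, 2));
}
```

Note that indices 1 and 3 are valid here while 2 is not, which is the part I would expect to surprise anyone used to normal indexing.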