Proposal for fixing dchar ranges

Johannes Pfau nospam at example.com
Tue Mar 11 11:25:10 PDT 2014


On Tue, 11 Mar 2014 14:02:26 -0400,
"Steven Schveighoffer" <schveiguy at yahoo.com> wrote:

> A previous poster brings up this incorrect code:
> 
> auto index = countUntil(str, "xyz");
> auto newstr = str[index..$];
> 
> But it can easily be done this way also:
> 
> auto index = indexOf(str, "xyz");
> auto codepts = walkLength(str[0..index]);
> auto newstr = str[index..$];
> 
> Given how D works, I think it would be very costly and near
> impossible to somehow make the incorrect slice operation statically
> rejected. One simply has to be trained what a code point is, and what
> a code unit is. HOWEVER, for the most part, nobody needs to care.
> Strings work fine without having to randomly access specific code
> points or slice based on them. Using indexes works just fine.
> 
> -Steve

Yes, you can work around the count problem, but then it is not
"consistent across all uses of the string". What if the above code were
a generic template written for arrays? Then it silently fails for
strings and you have to special-case them.
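
To make the generic-template case concrete, here is a minimal sketch.
The helper name dropBefore is hypothetical (not a Phobos function); it
only wraps the countUntil-then-slice pattern quoted above. It assumes
Phobos auto-decoding, where countUntil on a string counts code points
while slicing counts code units.

```d
import std.algorithm : countUntil;

// Hypothetical generic helper, written with plain arrays in mind:
// return the part of `haystack` starting at the first `needle`.
T[] dropBefore(T)(T[] haystack, T[] needle)
{
    // For arrays this is an element index. For narrow strings it is
    // a count of code points, because the string is auto-decoded.
    auto index = countUntil(haystack, needle);
    // The slice below always uses the array's own units (code units
    // for a string), so the two can disagree.
    return index < 0 ? haystack[$ .. $] : haystack[index .. $];
}

void main()
{
    // Plain array: the index and the slice use the same unit.
    assert(dropBefore([1, 2, 3, 4], [3, 4]) == [3, 4]);

    // ASCII-only string: code points and code units coincide.
    assert(dropBefore("abxyz", "xyz") == "xyz");

    // Non-ASCII prefix: 2 code points, but 4 code units, so the
    // slice starts too early and nothing reports an error.
    assert(dropBefore("äöxyz", "xyz") != "xyz");
}
```

The same template instantiates for both int[] and string without any
diagnostic, which is the silent failure I mean.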

I think the problem here is that ranges / algorithms have to work on
the same data type as slicing/indexing. If .front returns code units,
then indexing/slicing should be done with code units. If it returns
code points, then slicing has to happen on code points for consistency,
or it should be disallowed. (Slicing on code units is important, no
doubt. But it is error-prone and should be explicit in some way:
string.sliceCP(a, b) or string.representation[a..b].)
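
A rough sketch of what the two explicit forms could look like. sliceCP
does not exist in Phobos; the version here is a hypothetical helper, and
it assumes the name means "slice by code point index". representation
and toUTFindex are existing Phobos functions.

```d
import std.string : representation;
import std.utf : toUTFindex;

// Hypothetical helper (not in Phobos): slice using code point
// indices by translating them to code unit offsets first.
string sliceCP(string s, size_t a, size_t b)
{
    return s[toUTFindex(s, a) .. toUTFindex(s, b)];
}

void main()
{
    string s = "äöxyz"; // 5 code points, 7 UTF-8 code units

    // Code point indices: "xyz" starts at code point 2.
    assert(sliceCP(s, 2, 5) == "xyz");

    // Code unit indices, made explicit by going through the raw
    // immutable(ubyte)[] view: "xyz" starts at code unit 4.
    assert(s.representation[4 .. $] == "xyz".representation);
}
```

Either way the unit is visible at the call site, instead of depending
on whether the index came from a range algorithm or from indexOf.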

