Narrow string is not a random access range

mist none at none.none
Wed Oct 24 05:18:04 PDT 2012


On Wednesday, 24 October 2012 at 12:03:10 UTC, Jonathan M Davis 
wrote:
> On Wednesday, October 24, 2012 13:39:50 Timon Gehr wrote:
>> You realize that the proposed solution is that arrays of code 
>> units
>> would no longer be arrays of code units?
>
> Yes and no. They'd be arrays of code units, but any operations 
> on them which
> weren't unicode safe would require using the rep property. So, 
> for instance,
> using ptr on them to pass to C functions would be fine, but 
> slicing wouldn't.
> It definitely would be a case of violating the turtles all the 
> way down
> principle, because arrays of code units wouldn't really be 
> proper arrays
> anymore, but as long as they're treated as actual arrays, they 
> _will_ be
> misused. The trick is doing something that's both correct and 
> reasonably
> efficient by default but allows fully efficient code if you 
> code with an
> understanding of unicode, and to do that, you can't have arrays 
> of code units
> like we do now. But for better or worse, that doesn't look like 
> it's going to
> change.
>
> What we have right now actually works quite well if you 
> understand the issues
> involved, but it's not newbie friendly at all.
>
> - Jonathan M Davis

What about a compromise: turning this proposal upside down and 
requiring something like "utfstring".decode to operate on 
symbols? (There is front & Co. in std.array, but I am thinking of 
something more tightly coupled to string.) It would remove the 
need to copy-paste the very same checks into every algorithm and 
move the decision about using code points vs. code units to the 
user side. Yes, it does not prohibit a lot of senseless 
operations, but at least it is a consistent approach. I 
personally believe that not being able to tell what to expect 
from a basic algorithm/operation applied to a string (without 
looking at the library source code) is a much more difficult 
situation than having to properly understand Unicode.
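
To make the distinction concrete, here is a rough sketch of what 
"the user picks the view explicitly" could look like. It is only 
an illustration: the "utfstring".decode spelling above is 
hypothetical, so the sketch assumes the byCodeUnit and byDchar 
helpers found in later versions of std.utf, plus 
std.string.representation, as stand-ins.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byCodeUnit, byDchar;

void main()
{
    // "h\u00E9llo": the U+00E9 character takes two UTF-8 code units.
    string s = "h\u00E9llo";

    // Array view: .length counts code units.
    assert(s.length == 6);

    // Default range view: Phobos decodes to code points while iterating.
    assert(s.walkLength == 5);

    // Explicitly chosen code units; this view has length and indexing.
    auto units = s.byCodeUnit;
    assert(units.length == 6);
    assert(units[1] == 0xC3); // first byte of the two-byte sequence

    // Explicitly chosen code points; iteration only, no indexing.
    assert(s.byDchar.walkLength == 5);

    // Raw bytes, for code that knowingly bypasses Unicode handling.
    immutable(ubyte)[] raw = s.representation;
    assert(raw.length == 6);
}
```

The point is that the same string gives 6 or 5 depending on the 
view, and with an explicit call at the use site the reader can 
tell which one an algorithm will see without opening the library 
source.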
