VLERange: a range in between BidirectionalRange and RandomAccessRange

Sat Jan 15 02:03:20 PST 2011

Nick Sabalausky wrote:

> "Andrei Alexandrescu" <SeeWebsiteForEmail at erdani.org> wrote in message
> news:ignon1$2p4k$1 at digitalmars.com...
>>
>> This may sometimes not be what the user expected; most of the time they'd
>> care about the code points.
>>
> 
> I dunno, spir has successfully convinced me that most of the time it's
> graphemes the user cares about, not code points. Using code points is just
> as misleading as using UTF-16 code units.

I agree. This is a very informative thread, thanks spir and everybody else. 

Going back to the topic, it seems to me that a Unicode string is a
surprisingly complicated data structure that can be viewed through several
kinds of ranges. In the light of this thread, a dchar doesn't seem like such
a useful type anymore: it is still too low-level an abstraction for dealing
with text correctly. Perhaps it is even less useful than a code unit, since
it gives an illusion of correctness to those who are not in the know.

The algorithms in std.string can be upgraded to handle all the issues
mentioned correctly, but the generic ones in std.algorithm will subtly do
the wrong thing when presented with dchar ranges. And, as I understood it,
the purpose of a VLERange was exactly to make generic algorithms just work
(tm) for strings.
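
To make "subtly do the wrong thing" concrete, here is a minimal sketch in D.
It assumes a Phobos release whose std.uni provides byGrapheme and
byCodePoint; the helper names reverseByCodePoint and reverseByGrapheme are
made up for this example and are not part of any library.

```d
import std.array : array;
import std.conv : to;
import std.range : retro, walkLength;
import std.uni : byCodePoint, byGrapheme;

// Hypothetical helper: reverse the string the way a generic algorithm
// sees it, i.e. as a range of dchar (code points).
string reverseByCodePoint(string s)
{
    return s.retro.array.to!string;
}

// Hypothetical helper: reverse whole graphemes, then flatten the result
// back into code points and re-encode it as UTF-8.
string reverseByGrapheme(string s)
{
    return s.byGrapheme.array.retro.byCodePoint.array.to!string;
}

void main()
{
    // "noël" spelled as 'e' followed by U+0308 COMBINING DIAERESIS
    string s = "noe\u0308l";

    assert(s.length == 6);                // UTF-8 code units
    assert(s.walkLength == 5);            // code points (dchar)
    assert(s.byGrapheme.walkLength == 4); // graphemes

    // Code point reversal moves the diaeresis onto the 'l'.
    assert(reverseByCodePoint(s) == "l\u0308eon");

    // Grapheme reversal keeps 'e' and its combining mark together.
    assert(reverseByGrapheme(s) == "le\u0308on");
}
```

Neither reversal throws or produces invalid UTF-8, which is what makes the
code point version easy to miss: the output is well-formed, just not what
the user meant.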

Is it still possible to solve this problem, or are we stuck with specialized
string algorithms? Would it work if a VLERange over a string were a
bidirectional range whose ElementType is a string slice holding one
grapheme, indexed by code unit? Frequently used string algorithms could
still be specialized for performance, and where no specialization exists,
the generic algorithms would still work.
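
A rough sketch of what I have in mind, in D. This is only an illustration
under assumptions: it relies on std.uni.graphemeStride being available, the
names GraphemeSlices and graphemeAt are hypothetical, and it implements only
the forward half of the range. back and popBack are left out because they
need backward grapheme segmentation, which this sketch does not attempt.

```d
import std.algorithm : count, equal;
import std.range : ElementType, isForwardRange, walkLength;
import std.uni : graphemeStride;

// Hypothetical range: each element is a slice of the original string
// covering exactly one grapheme. No decoding into dchar, no copying.
struct GraphemeSlices
{
    string source;

    @property bool empty() { return source.length == 0; }

    // Slice covering the first grapheme; its length is in code units.
    @property string front()
    {
        return source[0 .. graphemeStride(source, 0)];
    }

    void popFront()
    {
        source = source[graphemeStride(source, 0) .. $];
    }

    @property GraphemeSlices save() { return this; }

    // Hypothetical "indexing with code units": the grapheme that starts
    // at the given code unit offset. The caller must pass an offset that
    // lies on a grapheme boundary; this sketch does not check that.
    string graphemeAt(size_t codeUnitOffset)
    {
        auto len = graphemeStride(source, codeUnitOffset);
        return source[codeUnitOffset .. codeUnitOffset + len];
    }
}

static assert(isForwardRange!GraphemeSlices);
static assert(is(ElementType!GraphemeSlices == string));

void main()
{
    auto r = GraphemeSlices("noe\u0308l");

    // Generic algorithms now operate on whole graphemes.
    assert(r.walkLength == 4);
    assert(r.equal(["n", "o", "e\u0308", "l"]));

    // A bare "e" does not match the "e + diaeresis" grapheme.
    assert(r.count!(g => g == "e") == 0);
    assert(r.count!(g => g == "e\u0308") == 1);

    // Offset 2 is where the third grapheme starts, in code units.
    assert(r.graphemeAt(2) == "e\u0308");
}
```

Comparing elements with == here is a plain code unit comparison, so
canonically equivalent spellings (precomposed U+00EB versus 'e' + U+0308)
would still compare unequal unless the text is normalized first.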