VLERange: a range in between BidirectionalRange and RandomAccessRange
Lutger Blijdestijn
lutger.blijdestijn at gmail.com
Sat Jan 15 02:03:20 PST 2011
Nick Sabalausky wrote:
> "Andrei Alexandrescu" <SeeWebsiteForEmail at erdani.org> wrote in message
> news:ignon1$2p4k$1 at digitalmars.com...
>>
>> This may sometimes not be what the user expected; most of the time they'd
>> care about the code points.
>>
>
> I dunno, spir has successfully convinced me that most of the time it's
> graphemes the user cares about, not code points. Using code points is just
> as misleading as using UTF-16 code units.
I agree. This is a very informative thread, thanks spir and everybody else.

Going back to the topic, it seems to me that a Unicode string is a
surprisingly complicated data structure that can be viewed through several
kinds of ranges. In light of this thread, a dchar doesn't seem like such a
useful type anymore: it is still a low-level abstraction when the goal is to
deal with text correctly. It is perhaps even less useful than that, since it
gives an illusion of correctness to those who are not in the know.
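
To put numbers on those "multiple types of ranges", here is a minimal sketch.
It is written in Python rather than D, purely for illustration. The grapheme
count is an assumption-laden approximation: it only counts characters whose
canonical combining class is zero, which is not the full Unicode segmentation
algorithm.

```python
import unicodedata

# 'e' + U+0301 COMBINING ACUTE ACCENT, followed by U+1D11E MUSICAL SYMBOL G CLEF
s = "e\u0301\U0001D11E"

# View 1: UTF-8 code units (what a D string stores)
utf8_units = len(s.encode("utf-8"))

# View 2: UTF-16 code units (what a D wstring stores)
utf16_units = len(s.encode("utf-16-le")) // 2

# View 3: code points (what a range of dchar yields)
code_points = len(s)

# View 4: user-perceived characters. APPROXIMATION: every character with a
# nonzero combining class is assumed to belong to the preceding cluster.
approx_graphemes = sum(1 for ch in s if unicodedata.combining(ch) == 0)

print(utf8_units, utf16_units, code_points, approx_graphemes)  # 7 4 3 2
```

Four views, four different lengths for the same text; only the last one
matches what a reader would count on screen.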
The algorithms in std.string can be upgraded to handle all the issues
mentioned correctly, but the generic ones in std.algorithm will subtly do
the wrong thing when presented with dchar ranges. And, as I understood it,
the purpose of a VLERange was exactly to make generic algorithms just work
(tm) for strings.
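
A small sketch of what "subtly do the wrong thing" can look like. Again this
is Python used as a stand-in, not std.algorithm itself: Python's str is a
sequence of code points, so slicing and searching on it behave roughly like a
generic algorithm fed a dchar range.

```python
import unicodedata

# "noël" spelled with a decomposed ë: 'e' followed by U+0308 COMBINING DIAERESIS
text = "noe\u0308l"

# Generic reversal, one code point at a time.
reversed_by_code_point = text[::-1]

# The combining mark now follows 'l' instead of 'e', so the diaeresis
# ends up on the wrong letter.
assert reversed_by_code_point == "l\u0308eon"

# Generic search has a similar problem: a plain 'e' is "found" even though
# the reader sees only an 'ë' in this word.
found_plain_e = "e" in text

# Counting is off as well: 5 code points for 4 visible characters.
code_point_count = len(text)
precomposed_count = len(unicodedata.normalize("NFC", text))

print(found_plain_e, code_point_count, precomposed_count)  # True 5 4
```

None of these operations raises an error, and each result is valid Unicode,
which is what makes the mistake easy to miss.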
Is it still possible to solve this problem, or are we stuck with specialized
string algorithms? Would it work if a VLERange of string were a bidirectional
range with string slices of graphemes as the ElementType, indexed by code
units? Frequently used string algorithms could be specialized for
performance, and where they are not, the generic algorithms would still work.
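
To make the question concrete, here is a rough sketch of such a range. All
names (GraphemeRange, pop_front, generic_reverse, ...) are hypothetical and
only mimic D's range primitives. Two assumptions to keep in mind: cluster
boundaries are approximated with the canonical combining class rather than the
full Unicode rules, and the indices are Python code point offsets standing in
for the code unit offsets a D implementation would use.

```python
import unicodedata


def _extends_cluster(ch):
    # APPROXIMATION: a nonzero combining class means "belongs to the
    # previous cluster". Real grapheme segmentation has more rules.
    return unicodedata.combining(ch) != 0


class GraphemeRange:
    """Hypothetical bidirectional range; each element is a slice of the source."""

    def __init__(self, s):
        self.s = s
        self.lo = 0          # offset of the first remaining element
        self.hi = len(s)     # offset one past the last remaining element

    @property
    def empty(self):
        return self.lo >= self.hi

    def _front_end(self):
        i = self.lo + 1
        while i < self.hi and _extends_cluster(self.s[i]):
            i += 1
        return i

    def _back_start(self):
        i = self.hi - 1
        while i > self.lo and _extends_cluster(self.s[i]):
            i -= 1
        return i

    @property
    def front(self):
        return self.s[self.lo:self._front_end()]

    @property
    def back(self):
        return self.s[self._back_start():self.hi]

    def pop_front(self):
        self.lo = self._front_end()

    def pop_back(self):
        self.hi = self._back_start()


def generic_reverse(r):
    # A "generic algorithm": it only uses empty, back and pop_back,
    # and knows nothing about Unicode.
    parts = []
    while not r.empty:
        parts.append(r.back)
        r.pop_back()
    return "".join(parts)


result = generic_reverse(GraphemeRange("noe\u0308l"))
assert result == "le\u0308on"  # the diaeresis stays on the 'e'
```

Because the elements are slices, the generic code never splits a base
character from its combining marks, while a specialized string algorithm
could still work on the underlying offsets directly.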