VLERange: a range in between BidirectionalRange and RandomAccessRange

Gerrit Wichert gwichert at yahoo.com
Fri Jan 14 12:54:19 PST 2011


On 14.01.2011 15:34, Steven Schveighoffer wrote:
>
> Is it common to have multiple modifiers on a single character?  The
> problem I see with using decomposed canonical form for strings is that
> we would have to return a dchar[] for each 'element', which severely
> complicates code that, for instance, only expects to handle English.
>
> I was hoping to lazily transform a string into its composed canonical
> form, allowing the (hopefully rare) exception when a composed
> character does not exist.  My thinking was that this at least gives a
> useful string representation for 90% of usages, leaving the remaining
> 10% of usages to find a more complex representation (like your Text
> type).  If we only get like 20% or 30% there by making dchar the
> element type, then we haven't made it useful enough.
>
I'm afraid this is not a proper way to handle the problem. It may be
better for a language not to 'translate' by default.
If the user wants to convert the code points, this can be requested on
demand. But premature default conversion is a subtle way to lose
information that may be important.
Imagine we want to write a tool for dealing with the input/output of
some other ignorant legacy software. Even if it is only text files,
that software may choke on some converted input. So I believe it is
very important that we are able to reproduce strings in exactly the
form in which we read them in.
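
To illustrate the concern, here is a minimal sketch. It is written in
Python rather than D, using only the standard unicodedata module, and
the sample string is made up for illustration. It shows that converting
to composed canonical form (NFC) changes the underlying bytes, so a
tool that normalizes by default can no longer write back what it read.

```python
import unicodedata

# Hypothetical input: "e" followed by U+0301 COMBINING ACUTE ACCENT,
# i.e. the decomposed spelling of "é" as it might arrive from a file.
original = "e\u0301"

# What an implicit conversion to composed canonical form would produce.
composed = unicodedata.normalize("NFC", original)

original_bytes = original.encode("utf-8")
composed_bytes = composed.encode("utf-8")

# The two strings are canonically equivalent, but not identical:
# the code point sequence and the encoded bytes both differ.
assert composed == "\u00e9"
assert original != composed
assert original_bytes == b"e\xcc\x81"
assert composed_bytes == b"\xc3\xa9"

# Keeping the raw string and normalizing only on demand preserves
# the exact bytes that were read.
assert original.encode("utf-8") == original_bytes
```

A legacy program that compares bytes, or that only understands one of
the two spellings, would see these as different inputs.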

Gerrit

