VLERange: a range in between BidirectionalRange and RandomAccessRange

Joel C. Salomon joelcsalomon at gmail.com
Sun Jan 23 07:11:33 PST 2011


On 01/14/2011 09:34 AM, Steven Schveighoffer wrote:
> Is it common to have multiple modifiers on a single character?  The
> problem I see with using decomposed canonical form for strings is that
> we would have to return a dchar[] for each 'element', which severely
> complicates code that, for instance, only expects to handle English.

Hebrew:
• Almost every letter in a printed Hebrew Bible has at least one of:
  ‣ a vowel marker (the Hebrew alphabet is otherwise consonantal) and
  ‣ a /dagesh/ dot, indicating the difference between /b/ & /v/, or
    between /mm/ and /m/;
• almost every word has at least one letter with a cantillation mark in
  addition to the above; and
• other marks too complicated & off-topic to explain.

Vietnamese uses Latin letters with accents playing multiple roles, so
there are often two or three accent marks on a single letter; e.g., the
name of the creator of pdfTeX is spelled “Hàn Thế Thành”, with two
accents on the “e”.
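
To make the "multiple modifiers per character" point concrete, here is a rough sketch in Python (not D, and not anything from Phobos) that puts text into canonical decomposed form and counts the combining marks. The helper names decompose and combining_mark_count are made up for illustration, and the Hebrew sample (bet with a dagesh and a sheva) is my own example of a letter carrying two marks.

```python
import unicodedata


def decompose(text):
    """List the code points of `text` after canonical decomposition (NFD)."""
    nfd = unicodedata.normalize("NFD", text)
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in nfd]


def combining_mark_count(text):
    """Count code points with a non-zero canonical combining class."""
    nfd = unicodedata.normalize("NFD", text)
    return sum(1 for c in nfd if unicodedata.combining(c) != 0)


# Hypothetical sample inputs, chosen for illustration only.
viet_e = "\u1ebf"                # the "ế" in "Thế", one precomposed code point
hebrew_bet = "\u05d1\u05bc\u05b0"  # bet + dagesh + sheva, as typed

if __name__ == "__main__":
    for sample in (viet_e, hebrew_bet):
        print(combining_mark_count(sample), "marks:")
        for line in decompose(sample):
            print("   ", line)
```

Under NFD the single "ế" becomes three code points (the base letter plus a circumflex and an acute), so a range that yields one 'element' per user-perceived character would indeed have to hand back several dchars for it.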

I’m sure there are others.

—Joel

