Should this work?

Marco Leise Marco.Leise at gmx.de
Fri Jan 10 11:03:12 PST 2014


On Fri, 10 Jan 2014 18:07:54 +0100,
Jacob Carlborg <doob at me.com> wrote:

> On 2014-01-10 17:01, Marco Leise wrote:
> 
> > Sorry, I got confused with the Unicode definitions. I see now
> > that a grapheme cluster is e.g. \r\n. What I really meant is
> > that Phobos needs to support graphemes. But seeing that
> > monsters like this exist: n͠g, I don't even know if this is
> > one character or two, but right now Phobos sees it as three
> > characters.
> 
> Thunderbird sees that as two characters. Ruby sees it as three.

I think this is the official document (or one of them) on
where a "user-perceived character" ends:

http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules

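According to this, the above n͠g is indeed defined as two
characters (grapheme clusters): the n with its combining mark,
then the g. Counting three means counting code points, so Ruby
is just no better than Phobos :p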
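
To make the difference between code points and user-perceived
characters concrete, here is a rough Python sketch. It is not the
full TR29 algorithm: it only keeps CR LF together and attaches
combining marks (general categories Mn, Mc, Me) to the preceding
character. The function name approx_grapheme_count is made up for
this example, and I am assuming the mark in the example above is
U+0360 (COMBINING DOUBLE TILDE).

```python
import unicodedata


def approx_grapheme_count(text: str) -> int:
    """Rough count of user-perceived characters.

    Simplification: handles only CR LF pairs and combining marks.
    The real UAX #29 rules also cover Hangul syllables, ZWJ emoji
    sequences, regional indicators and more.
    """
    count = 0
    prev = ""
    for ch in text:
        is_mark = unicodedata.category(ch) in ("Mn", "Mc", "Me")
        if ch == "\n" and prev == "\r":
            pass  # CR LF stays a single cluster
        elif is_mark and prev not in ("", "\r", "\n"):
            pass  # a combining mark extends the previous cluster
        else:
            count += 1  # anything else starts a new cluster
        prev = ch
    return count


sample = "n\u0360g"  # assumed spelling of the example: n, U+0360, g
code_points = len(sample)               # what a code-point count sees
clusters = approx_grapheme_count(sample)
```

With these assumptions the sample has three code points but two
clusters, and "\r\n" counts as one. For real use, a library that
implements the complete boundary rules is the better choice.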


»Grapheme cluster boundaries are important for collation,
 regular expressions, UI interactions (such as mouse selection,
 arrow key movement, backspacing), segmentation for vertical
 text, identification of boundaries for first-letter styling,
 and counting “character” positions within text.«

-- 
Marco


