Should this work?
Marco Leise
Marco.Leise at gmx.de
Fri Jan 10 11:03:12 PST 2014
On Fri, 10 Jan 2014 18:07:54 +0100,
Jacob Carlborg <doob at me.com> wrote:
> On 2014-01-10 17:01, Marco Leise wrote:
>
> > Sorry, I got confused with the Unicode definitions. I see now
> > that a grapheme cluster is e.g. \r\n. What I really meant is
> > that Phobos needs to support graphemes. But seeing that
> > monsters like this exist: n͠g, I don't even know if this is
> > one character or two, but right now Phobos sees it as three
> > characters.
>
> Thunderbird sees that as two characters. Ruby sees it as three.
I think this is the (or one of the) official documents about
where a "user-perceived character" ends:
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules
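According to these rules, the n͠g above is indeed 2
user-perceived characters (grapheme clusters), not 3; the 3
that Ruby and Phobos report is the number of code points.
Ruby is just no better than Phobos :p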
»Grapheme cluster boundaries are important for collation,
regular expressions, UI interactions (such as mouse selection,
arrow key movement, backspacing), segmentation for vertical
text, identification of boundaries for first-letter styling,
and counting “character” positions within text.«
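To make the 2-versus-3 difference concrete, here is a rough sketch in Python (not D, and not how Phobos does it). It assumes a heavily simplified version of the TR29 rules: a combining mark sticks to the character before it, and \r\n stays together. Everything else in the report (Hangul, emoji sequences, and so on) is ignored, and the function names are made up for illustration.

```python
import unicodedata


def count_code_points(text: str) -> int:
    """Number of Unicode code points; this is the 'three characters' view."""
    return len(text)


def count_clusters_approx(text: str) -> int:
    """Approximate number of user-perceived characters.

    Simplified assumption, NOT the full TR29 algorithm: a new cluster
    starts at every code point unless it is a combining mark (general
    category Mn, Mc or Me) or the LF of a CR LF pair.
    """
    count = 0
    prev = ""
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        is_crlf_tail = prev == "\r" and ch == "\n"
        # The first code point always opens a cluster, even a stray mark.
        if count == 0 or not (is_mark or is_crlf_tail):
            count += 1
        prev = ch
    return count


# 'n', U+0360 COMBINING DOUBLE TILDE, 'g'
sample = "n\u0360g"
print(count_code_points(sample), count_clusters_approx(sample))
```

With these assumptions the sample comes out as 3 code points but 2 clusters, and "\r\n" counts as a single cluster. A real implementation would follow the full boundary rules from the report linked above.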
--
Marco