Unicode handling comparison
Dmitry Olshansky
dmitry.olsh at gmail.com
Wed Nov 27 12:28:44 PST 2013
On 27-Nov-2013 20:22, Wyatt wrote:
> On Wednesday, 27 November 2013 at 16:18:34 UTC, Wyatt wrote:
>>
>> trouble following all that (e.g. Isn't "noe\u0308l" a grapheme
>>
> Whoops, overzealous pasting. That is, "e\u0308", which composes to
> "ë". A grapheme cluster seems to represent one printed character: "...a
> horizontally segmentable unit of text, consisting of some grapheme base
> (which may consist of a Korean syllable) together with any number of
> nonspacing marks applied to it."
>
> Is that about right?
As far as the standard defines it, yes. (Actually, it talks about
boundaries, and a grapheme is whatever happens to lie between them.)
More specifically, D's std.uni follows the notion of the extended
grapheme cluster. There is no need to stick with ugly legacy crap.
See also
http://www.unicode.org/reports/tr29/
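
To make the distinction concrete, here is a minimal sketch of the three
ways to count "noe\u0308l". It assumes a Phobos release in which std.uni
provides byGrapheme; countGraphemes is a hypothetical helper written for
this example, not something Phobos ships.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;

// Hypothetical helper (not part of Phobos): counts the grapheme
// clusters that std.uni finds in a string.
size_t countGraphemes(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by U+0308 COMBINING DIAERESIS, displayed as "ë"
    string s = "noe\u0308l";

    writeln(s.length);          // UTF-8 code units: 6
    writeln(s.walkLength);      // code points: 5
    writeln(countGraphemes(s)); // grapheme clusters: 4
}
```

The "e\u0308" pair is two code points but a single grapheme, which is
why only the last count matches what a reader sees on screen.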
>
> -Wyatt
--
Dmitry Olshansky