The Case Against Autodecode

Tobias M via Digitalmars-d digitalmars-d at puremagic.com
Sun May 29 06:04:18 PDT 2016


On Sunday, 29 May 2016 at 12:08:52 UTC, default0 wrote:
> I am pretty sure that a single grapheme in unicode does not 
> correspond to your notion of "character". I am pretty sure that 
> what you think of as a "character" is officially called 
> "Grapheme Cluster" not "Grapheme".

Grapheme is a linguistic term. AFAIUI, a grapheme cluster is a 
cluster of codepoints representing a grapheme. It's called 
"cluster" in the Unicode spec because there is no 
dedicated grapheme unit.
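
To make the distinction concrete, here is a rough Python sketch that counts the same string three ways: UTF-8 code units, codepoints, and an approximation of grapheme clusters. The helper name rough_clusters is made up for illustration, and it only attaches combining marks to the preceding base codepoint. Real segmentation follows the rules in UAX #29, which also cover things like Hangul syllables and emoji sequences, so treat this as an assumption-laden sketch rather than a correct segmenter.

```python
import unicodedata


def rough_clusters(text):
    """Group each base codepoint with the combining marks that follow it.

    Hypothetical helper: a deliberate simplification of grapheme cluster
    segmentation, NOT an implementation of the UAX #29 rules.
    """
    clusters = []
    for ch in text:
        # A nonzero canonical combining class means "attach to previous".
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "cafe" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form of "café")
word = "cafe\u0301"

utf8_units = len(word.encode("utf-8"))  # code units in the UTF-8 encoding
code_points = len(word)                 # Python str is indexed by codepoint
clusters = rough_clusters(word)         # approximate user-perceived characters

print(utf8_units, code_points, len(clusters))
```

The three counts differ for this string: the last "character" a reader sees is two codepoints, and the combining accent alone takes two UTF-8 code units.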

I put "character" into quotes because the term is not really 
well defined. I just used it for a short, concise answer. I'm 
sure there's a better/more correct definition of grapheme/phoneme, 
but it's probably also much longer and more complicated.

