Accented Characters and Counting Syllables

"Nordlöw" via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Dec 6 14:37:17 PST 2014


Given the fact that

     static assert("é".length == 2);

I was surprised that

     static assert("é".byCodeUnit.length == 2);
     static assert("é".byCodePoint.length == 2);

Isn't there a way to iterate over accented characters (in my case 
UTF-8) in D? Or is this an inherent problem in Unicode? I need 
this in a syllable-counting algorithm that needs to distinguish 
accented and non-accented variants of vowels. For example, café (two 
syllables) compared to babe (one syllable).
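
For reference, here is a minimal sketch of one way this might be 
approached. It assumes that Phobos provides std.uni.byGrapheme, 
std.uni.normalize and std.range.walkLength; none of these appear in 
the snippets above, so treat them as assumptions to check against the 
installed compiler. It compares three ways of counting: UTF-8 code 
units, code points, and user-perceived characters (graphemes).

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC;

void main()
{
    // Precomposed form: a single code point, U+00E9.
    string precomposed = "\u00E9";
    assert(precomposed.length == 2);                // UTF-8 code units
    assert(precomposed.walkLength == 1);            // code points
    assert(precomposed.byGrapheme.walkLength == 1); // graphemes

    // Decomposed form: 'e' followed by U+0301 (combining acute accent).
    string decomposed = "e\u0301";
    assert(decomposed.length == 3);                 // UTF-8 code units
    assert(decomposed.walkLength == 2);             // code points
    assert(decomposed.byGrapheme.walkLength == 1);  // graphemes

    // NFC normalization composes the pair back into U+00E9, so both
    // spellings can then be compared against one accented vowel.
    assert(normalize!NFC(decomposed) == precomposed);
}
```

If this holds, a syllable counter could normalize its input to NFC 
first and then test each grapheme (or code point) against a set of 
accented and non-accented vowels.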

