How to check i

spir via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Oct 16 12:43:03 PDT 2014


On 16/10/14 20:46, Uranuz via Digitalmars-d-learn wrote:
> I have some string *str* of unicode characters. The question is how to check if
> I have valid unicode code point starting at code unit *index*?
> [...]

You cannot do that without decoding. Checking whether UTF-x is valid and decoding 
are the very same process. IIRC, D has a validation func which is more or less 
just an alias for the decoding func ;-). Moreover, you also need to distinguish 
"word-character" code points from others (punctuation, spacing, etc.), which 
requires Unicode code points (the Unicode Consortium provides tables for such tasks).
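
To illustrate "validating is decoding", here is a minimal sketch that assumes a 
UTF-8 `string` and uses `std.utf.decode`, which throws `UTFException` on a 
malformed sequence. The helper name `startsValidCodePoint` is made up for this 
example; it is not a Phobos function.

```d
import std.utf : decode, UTFException;

/// Hypothetical helper: true if a well-formed UTF-8 sequence
/// starts at code unit `index` of `str`.
bool startsValidCodePoint(string str, size_t index)
{
    if (index >= str.length)
        return false;            // nothing to decode past the end

    try
    {
        size_t i = index;        // decode advances the index it is given
        decode(str, i);          // throws UTFException if malformed
        return true;
    }
    catch (UTFException e)
    {
        return false;            // e.g. index points at a continuation byte
    }
}
```

For "héllo", indices 0 and 1 start a code point, while index 2 falls in the 
middle of the two-byte 'é' and is rejected.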

Thus, I would recommend that you just abandon the illusion of working at the level 
of code units for such tasks, and simply operate on strings of code points. (Why 
do you think D has them built in?)
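
A rough sketch of what working on code points can look like, assuming the input 
is a UTF-8 `string` that gets converted to a `dstring` (one `dchar` per code 
point). `std.uni.isAlpha` stands in here for the "word-character" test, which is 
only an approximation of whatever classification you actually need, and 
`countLetters` is a made-up name for this example.

```d
import std.conv : to;
import std.uni : isAlpha;

/// Hypothetical helper: counts the alphabetic code points in `s`.
size_t countLetters(string s)
{
    dstring points = s.to!dstring;   // UTF-8 code units -> UTF-32 code points
    size_t n = 0;
    foreach (dchar c; points)        // each element is one whole code point
    {
        if (isAlpha(c))              // Unicode-aware alphabetic test
            ++n;
    }
    return n;
}
```

Once the text is a `dstring`, `points[i]` is always a whole code point, so the 
"is this index valid?" question no longer arises.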

denis

