Is º an unicode alphabetic character?

Ali Çehreli via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Sep 11 21:04:22 PDT 2014


On 09/11/2014 08:04 PM, AsmMan wrote:

 > what's an unicode alphabetic character?

Alphabetic is defined as Lu + Ll + Lt + Lm + Lo + Nl + Other_Alphabetic, 
all of which are explained here:

   http://www.unicode.org/Public/5.1.0/ucd/UCD.html#General_Category_Values

 > I misunderstood isAlpha(), I
 > used to think it's to validate letters like a, b, è, é .. z etc but
 > isAlpha('º') from std.uni module return true.

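º (U+00BA MASCULINE ORDINAL INDICATOR) is classified as a letter: "Letter, 
Lowercase" (Ll) in the 5.1.0 data linked above, and "Letter, Other" (Lo) 
since Unicode 6.1. Either way it is Alphabetic, so isAlpha('º') returns true.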
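
A minimal sketch of the difference between the Unicode-aware and the 
ASCII-only predicates. It uses only std.uni.isAlpha and std.ascii.isAlpha 
from Phobos; which general category º falls into depends on the Unicode 
version your Phobos was built against, so the sketch checks only the 
Alphabetic result.

```d
import std.stdio : writeln;
static import std.ascii;
static import std.uni;

void main()
{
    // U+00BA MASCULINE ORDINAL INDICATOR counts as a letter in Unicode.
    assert(std.uni.isAlpha('º'));

    // The ASCII-only predicate accepts nothing outside a-z and A-Z.
    assert(!std.ascii.isAlpha('º'));
    assert(std.ascii.isAlpha('a'));

    // Accented letters are Alphabetic as well.
    assert(std.uni.isAlpha('è') && std.uni.isAlpha('é'));

    writeln("ok");
}
```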

 > How can I validate only
 > the letters of an unicode alphabet in D or should I write one?

There are many alphabets in the world, and Unicode encodes letters from 
most of them, so a Unicode letter is very likely to be part of some alphabet.
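
If only one alphabet matters, one possible approach is to intersect a 
script set with the Alphabetic property. The sketch below assumes the 
Latin script is the one of interest; isLatinLetter is a hypothetical 
helper name, not something Phobos provides.

```d
import std.uni : unicode;

// Hypothetical helper: true for Alphabetic code points of the Latin script.
bool isLatinLetter(dchar c)
{
    // Rebuilding the set on every call is wasteful; acceptable for a sketch.
    auto latinLetters = unicode.script.Latin & unicode.Alphabetic;
    return latinLetters[c];
}

void main()
{
    assert(isLatinLetter('a'));
    assert(isLatinLetter('é'));
    assert(!isLatinLetter('я')); // Cyrillic letter, wrong script
    assert(!isLatinLetter('1')); // digit, not Alphabetic
}
```

Note that º itself is assigned to the Latin script, so a filter like this 
would still need to exclude it explicitly if it is unwanted.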

 > I know I can do:
 >
 > bool is_id(dchar c)
 > {
 >      return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'z' || c >= 0xc0;
 > }

That test is too broad. Many Unicode characters are >= 0xc0 but are not 
in the Alphabetic category, for example ← (U+2190 LEFTWARDS ARROW).
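
To illustrate, here is a small comparison of the range test with 
std.uni.isAlpha. The names rangeTest and isIdChar are hypothetical, and 
the range test is written with 'Z' as the upper bound of the second 
range, which is presumably what the quoted code intended.

```d
import std.uni : isAlpha;

// The range test from the question, kept for comparison
// (with 'Z' as the presumably intended upper bound).
bool rangeTest(dchar c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c >= 0xc0;
}

// Hypothetical replacement based on the Alphabetic property.
bool isIdChar(dchar c)
{
    return isAlpha(c);
}

void main()
{
    dchar arrow = '\u2190'; // LEFTWARDS ARROW

    assert(rangeTest(arrow));  // the range test accepts the arrow
    assert(!isIdChar(arrow));  // isAlpha rejects it: not Alphabetic
    assert(isIdChar('é'));     // accented letters still pass
}
```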

Ali


