Is º a Unicode alphabetic character?

AsmMan via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Sep 11 23:38:49 PDT 2014


On Friday, 12 September 2014 at 04:04:22 UTC, Ali Çehreli wrote:
> On 09/11/2014 08:04 PM, AsmMan wrote:
>
> > What's a Unicode alphabetic character?
>
> Alphabetic is defined as Lu + Ll + Lt + Lm + Lo + Nl + 
> Other_Alphabetic, all of which are explained here:
>
>   http://www.unicode.org/Public/5.1.0/ucd/UCD.html#General_Category_Values
>
> > I misunderstood isAlpha(). I used to think it validates only
> > letters like a, b, è, é .. z etc., but isAlpha('º') from the
> > std.uni module returns true.
>
> º happens to be in the "Letter, Lowercase" category so yes, it 
> is isAlpha().
>
> > How can I validate only the letters of a Unicode alphabet in D,
> > or should I write such a function myself?
>
> There are so many alphabets in the world. It is likely that a 
> Unicode character will be a part of one.
>
> > I know I can do:
> >
> > bool is_id(dchar c)
> > {
> >      return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= 0xc0;
> > }
>
> There is a misunderstanding. There are so many Unicode 
> characters that are >= 0xc0 but not a part of the Alphabetic 
> category. For example: ← (U+2190 LEFTWARDS ARROW).
>
> Ali

If I want to accept only ASCII and Latin letters, which ranges should
I use? That is, how should I rewrite the is_id() function?
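
One possible answer, as a minimal sketch rather than a definitive one: it assumes "Latin" means the letters in the Latin-1 Supplement, Latin Extended-A and Latin Extended-B blocks (U+00C0 through U+024F). That cut-off is an assumption and is not stated anywhere in this thread. Later blocks such as Latin Extended Additional (U+1E00 through U+1EFF) are deliberately left out, and × (U+00D7) and ÷ (U+00F7) are excluded because they are math symbols that sit inside the Latin-1 letter range.

```d
/// Sketch of a rewritten is_id(): true for ASCII letters and for letters
/// in U+00C0..U+024F (assumed meaning of "Latin" here).
bool is_id(dchar c)
{
    // Plain ASCII letters; note the upper bound 'Z' for the uppercase range.
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
        return true;

    // Latin-1 Supplement letters plus Latin Extended-A and Extended-B.
    if (c >= 0x00C0 && c <= 0x024F)
    {
        // U+00D7 MULTIPLICATION SIGN and U+00F7 DIVISION SIGN are not letters.
        return c != 0x00D7 && c != 0x00F7;
    }

    // Everything else, including U+00BA (º) and U+2190 (←), is rejected.
    return false;
}
```

With this version is_id('é') is true, while is_id('º') and is_id('←') are both false, since º (U+00BA) lies below U+00C0 and ← (U+2190) lies above U+024F.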

