identifiers & "unialpha"

Sean Kelly sean at f4.ca
Fri Sep 22 08:48:42 PDT 2006


Thomas Kuehne wrote:
> http://www.digitalmars.com/d/lex.html#identifier
> # Identifiers start with a letter, _, or universal alpha, and are followed
> # by any number of letters, _, digits, or universal alphas. Universal
> # alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the
> # C99 Standard.)
> 
> Why does D reference "ISO/IEC 9899:1999 (E) Appendix D" to define
> "universal alpha"? "ISO/IEC 9899:1999 (E) Appendix D" doesn't list
> "universal alpha" anywhere.
> 
> Sample:
> \u00B7 (MIDDLE DOT, Other_Punctuation) isn't a "universal alpha" but
> is allowed in identifiers by Appendix D.
> 
> "ISO/IEC 9899:1999 (E) Appendix D" itself references
> "ISO/IEC TR 10176:1998" for the character data. I strongly suggest
> dropping the redirection via "Appendix D" and using
> "ISO/IEC TR 10176 (current)" instead of the dated version
> "ISO/IEC TR 10176:1998". The 1998 version did not yet include a large
> number of the CJK and Math characters found in the current version.

Agreed.  Incidentally, the 2003 revision of the C++ standard ("ISO/IEC 
14882:2003(E)"), Appendix E, contains a revised copy of the character 
table (likely taken from "ISO/IEC TR 10176:2003") and appears to have 
done away with the "special characters" section entirely.  So I suspect 
your suggestion would also eliminate the problem you mention above.
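
For what it's worth, the MIDDLE DOT classification quoted above is easy to 
check.  Here is a minimal sketch in Python (not D).  It assumes that the 
Unicode data bundled with the interpreter is an acceptable stand-in for the 
TR 10176 tables, which it may not be in every detail, and the helper name 
"describe" is just something made up for illustration:

```python
import unicodedata

def describe(ch):
    """Return (name, general category, is-letter flag) for one character.

    The is-letter flag is only a rough proxy for "alpha": it is True when
    the Unicode general category starts with "L" (Lu, Ll, Lt, Lm, Lo).
    It is NOT the TR 10176 or C99 Appendix D definition.
    """
    category = unicodedata.category(ch)
    return unicodedata.name(ch), category, category.startswith("L")

# U+00B7 MIDDLE DOT: the character from the sample above.
print(describe("\u00B7"))

# U+00E9, an accented Latin letter, for comparison.
print(describe("\u00E9"))
```

The first call reports category "Po" (Other_Punctuation) with the letter 
flag unset, and the second reports "Ll" with the flag set.  That only 
confirms the general category, of course; whether a given character belongs 
in identifiers still depends on which table the spec ends up referencing.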


Sean


