identifiers & "unialpha"
Sean Kelly
sean at f4.ca
Fri Sep 22 08:48:42 PDT 2006
Thomas Kuehne wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> http://www.digitalmars.com/d/lex.html#identifier
> # Identifiers start with a letter, _, or universal alpha, and are followed
> # by any number of letters, _, digits, or universal alphas. Universal
> # alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the
> # C99 Standard.)
>
> Why does D reference "ISO/IEC 9899:1999 (E) Appendix D" to define
> "universal alpha"? "ISO/IEC 9899:1999 (E) Appendix D" doesn't list
> "universal alpha".
>
> Sample:
> \u00B7 (MIDDLE DOT, Other_Punctuation) isn't a "universal alpha" but
> is allowed by Appendix D in identifiers.
>
> "ISO/IEC 9899:1999 (E) Appendix D" itself references
> "ISO/IEC TR 10176:1998" for the character data. I strongly suggest
> dropping the indirection via "Appendix D" and using
> "ISO/IEC TR 10176 (current)" instead of the dated version
> "ISO/IEC TR 10176:1998". The 1998 version didn't yet include quite a
> chunk of the CJK and Math characters that can be found in the current version.
Agreed. Incidentally, the 2003 revision to the C++ standard ("ISO/IEC
14882:2003(E)"), Appendix E, contains a revised copy of the character
table (which is likely from "ISO/IEC TR 10176:2003") and appears to have
done away with the "special characters" section entirely. So I suspect
your suggestion would also eliminate the problem you mention above.
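
To make the mismatch concrete, here is a rough sketch in Python of the identifier rule as lex.html words it, checked against a small excerpt of the Appendix D table. The names (APPENDIX_D_EXCERPT, is_identifier and friends) are made up for illustration, "letter" is assumed to mean an ASCII letter, and the table holds only three entries as I recall them from the standard, so verify them before relying on this. It is not how the D lexer is implemented.

```python
import unicodedata

# Illustrative excerpt only, NOT the full Appendix D table. These entries
# are recalled from the C99 text and should be checked against the standard.
APPENDIX_D_EXCERPT = [
    (0x00B7, 0x00B7),  # MIDDLE DOT, the sample character from the post
    (0x00C0, 0x00D6),  # Latin letters
    (0x00D8, 0x00F6),  # Latin letters
]

def in_table(ch):
    """True if ch falls in one of the excerpted Appendix D ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in APPENDIX_D_EXCERPT)

def is_ascii_letter(ch):
    # Assumption: "letter" in the lexical spec means A-Z or a-z.
    return "a" <= ch <= "z" or "A" <= ch <= "Z"

def is_ident_start(ch):
    # "Identifiers start with a letter, _, or universal alpha"
    return is_ascii_letter(ch) or ch == "_" or in_table(ch)

def is_ident_char(ch):
    # "... followed by any number of letters, _, digits, or universal alphas"
    return is_ident_start(ch) or "0" <= ch <= "9"

def is_identifier(s):
    return bool(s) and is_ident_start(s[0]) and all(is_ident_char(c) for c in s[1:])

if __name__ == "__main__":
    # The table admits U+00B7 although Unicode classifies it as punctuation.
    print(unicodedata.category("\u00b7"))  # Po
    print(is_identifier("a\u00b7b"))       # True
```

Under those assumptions the table accepts U+00B7 inside an identifier even though its Unicode general category is Po (Other_Punctuation) rather than a letter category, which is the discrepancy described above.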
Sean