[Issue 3455] Some Unicode characters not allowed in identifiers

d-bugmail at puremagic.com
Fri Oct 30 09:51:11 PDT 2009


http://d.puremagic.com/issues/show_bug.cgi?id=3455


Matti Niemenmaa <matti.niemenmaa+dbugzilla at iki.fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |spec
                 CC|                            |matti.niemenmaa+dbugzilla at iki.fi
           Platform|Other                       |All
         OS/Version|Linux                       |All
           Severity|normal                      |enhancement


--- Comment #1 from Matti Niemenmaa <matti.niemenmaa+dbugzilla at iki.fi> 2009-10-30 09:51:09 PDT ---
As http://www.digitalmars.com/d/1.0/lex.html#identifier very clearly states,
the allowed characters in identifiers are those defined in the C99 standard,
ISO/IEC 9899:1999(E) Annex D. Have a look at it:
http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf

９ (code point 0xff19, FULLWIDTH DIGIT NINE) is not in that list. In fact, the
highest code point in the list is 0xd7a3.
The compiler follows the spec here, so this is not a bug but an enhancement
request.

However, rather than an arbitrary and frozen list, I /would/ prefer basing it
simply on Unicode properties, along the lines of Java's choice: identifiers may
start with letters or numeric letters, and may contain, in addition to those,
connecting punctuation, decimal digits, and combining and non-spacing marks.
In other words:

Identifiers may start with code points from the general categories Ll, Lm, Lo,
Lt, Lu, Nl.

Identifiers may contain code points from the general categories Ll, Lm, Lo, Lt,
Lu, Mc, Mn, Nd, Nl, No, Pc.
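
To make the two category lists above concrete, here is a rough sketch in
Python (not D, and not anything the compiler does today). The function name
is_proposed_identifier is made up for illustration, and the result depends on
the Unicode version shipped with the Python interpreter. The sketch follows
the lists literally, so "_" (category Pc) is accepted inside an identifier but
not as its first character; a real rule would presumably special-case it.

```python
import unicodedata

# General categories from the proposal above (hypothetical rule, not current D).
START_CATEGORIES = frozenset({"Ll", "Lm", "Lo", "Lt", "Lu", "Nl"})
PART_CATEGORIES = START_CATEGORIES | {"Mc", "Mn", "Nd", "No", "Pc"}


def is_proposed_identifier(s: str) -> bool:
    """Return True if s is an identifier under the proposed category rule."""
    if not s:
        return False
    # The first code point must come from the "start" categories.
    if unicodedata.category(s[0]) not in START_CATEGORIES:
        return False
    # Every later code point must come from the wider "contain" set.
    return all(unicodedata.category(c) in PART_CATEGORIES for c in s[1:])


if __name__ == "__main__":
    # U+FF19 (FULLWIDTH DIGIT NINE) is category Nd: fine inside, not at the start.
    print(is_proposed_identifier("x\uff19"))
    print(is_proposed_identifier("\uff19x"))
```

Under this rule the character from the original report would be accepted
anywhere except in first position, the same way ASCII digits are treated.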

Java also allows some Cc characters and Cf, whose usefulness I'm not convinced
of. These are control characters and things like "soft hyphen", which isn't
even supposed to be displayed unless the word is broken across lines. Too much
potential for confusion IMHO.
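
A small Python illustration of the confusion I mean (the variable names are
only for the example): two strings that typically render the same but differ
by one invisible Cf character.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # SOFT HYPHEN, general category Cf

plain = "filename"
hidden = "file" + SOFT_HYPHEN + "name"  # usually displayed just like `plain`

# The category confirms this is a format character, not a letter or mark.
print(unicodedata.category(SOFT_HYPHEN))

# The two strings are distinct even though they tend to look identical.
print(plain == hidden)
print(len(plain), len(hidden))
```

If Cf were allowed in identifiers, these would be two different names that a
reader could not tell apart in most editors.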
