Why is BOM required to use unicode in tokens?

Dominikus Dittes Scherkl dominikus at scherkl.de
Wed Sep 16 13:16:39 UTC 2020


On Wednesday, 16 September 2020 at 07:38:26 UTC, Dominikus Dittes 
Scherkl wrote:
> We only need to define which properties a character needs to have 
> to be allowed in an identifier.

I think the following change in the grammar would be sufficient:

Identifier:
     IdentifierStart
     IdentifierStart IdentifierChars

IdentifierChars:
     IdentifierChar
     IdentifierChar IdentifierChars

IdentifierStart:
     _
     Any Unicode codepoint with general category Lu, Ll, Lt, Lo, Nl or No

IdentifierChar:
     IdentifierStart
     Any Unicode codepoint with general category Lm, Mn, Me, Mc or Nd
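
To make the rule above concrete, here is a rough sketch of the check in Python. It is only an illustration of the proposed grammar, not how any D compiler does it. The function names are made up for this example, the category data comes from whatever Unicode version Python's unicodedata module ships with, and keywords are not considered at all.

```python
import unicodedata

# General categories the proposal allows for the first character,
# in addition to the underscore.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lo", "Nl", "No"}

# Extra categories the proposal allows after the first character.
CONTINUE_CATEGORIES = {"Lm", "Mn", "Me", "Mc", "Nd"}


def is_identifier_start(ch):
    """IdentifierStart: '_' or a codepoint in one of the start categories."""
    return ch == "_" or unicodedata.category(ch) in START_CATEGORIES


def is_identifier_char(ch):
    """IdentifierChar: IdentifierStart or a codepoint in a continue category."""
    return (is_identifier_start(ch)
            or unicodedata.category(ch) in CONTINUE_CATEGORIES)


def is_identifier(text):
    """Identifier: one IdentifierStart followed by zero or more IdentifierChars."""
    if not text:
        return False
    return (is_identifier_start(text[0])
            and all(is_identifier_char(ch) for ch in text[1:]))
```

Under this sketch "größe" and "x1" are accepted, "9lives" is rejected because a decimal digit (Nd) may not start an identifier, and a combining mark such as U+0301 (Mn) is accepted only after the first character.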



