Why is BOM required to use unicode in tokens?

Dominikus Dittes Scherkl dominikus at scherkl.de
Wed Sep 16 07:38:26 UTC 2020


On Wednesday, 16 September 2020 at 00:22:15 UTC, Steven 
Schveighoffer wrote:

> Someone should verify that the character you want to use for a 
> symbol name is actually considered a letter or not. Using 
> phobos to prove this is kind of self-defeating, as I'm pretty 
> sure it would be in league with DMD if there is a bug.

UnicodeData.txt (a data file provided by the Unicode organization 
itself since version 1) contains exactly the necessary properties 
(in an easily parsable format), so we don't need to hard-code the 
list of allowed identifier characters, but can instead use the 
latest version provided by Unicode (changing every year!). We only 
need to define which properties a character needs in order to be 
allowed in an identifier.
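
To illustrate, here is a minimal sketch of that idea in Python (chosen 
only for brevity; the logic should port to D directly). It relies on 
the documented layout of UnicodeData.txt: semicolon-separated fields, 
with the code point in field 0, the name in field 1 and the General 
Category in field 2, and large blocks given as a "First"/"Last" pair. 
The set ALLOWED_CATEGORIES, the function names and the sample lines 
are assumptions of mine for illustration; this is not the rule DMD 
actually implements.

```python
# Sketch: derive the allowed identifier characters from UnicodeData.txt.
# ALLOWED_CATEGORIES is a hypothetical policy, not D's actual rule.
ALLOWED_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}


def parse_allowed(lines, allowed=ALLOWED_CATEGORIES):
    """Return (first, last) inclusive code point ranges, in file order.

    Assumes the lines are sorted by code point, as in the real file.
    """
    ranges = []
    range_start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        cp = int(fields[0], 16)
        name, category = fields[1], fields[2]
        # Large blocks are listed as "<..., First>" / "<..., Last>" pairs.
        if name.endswith(", First>"):
            range_start = cp
            continue
        if name.endswith(", Last>") and range_start is not None:
            first, range_start = range_start, None
        else:
            first = cp
        if category in allowed:
            # Merge with the previous range when the code points are adjacent.
            if ranges and ranges[-1][1] + 1 == first:
                ranges[-1] = (ranges[-1][0], cp)
            else:
                ranges.append((first, cp))
    return ranges


def is_identifier_char(cp, ranges):
    """True if code point cp falls into one of the allowed ranges."""
    return any(first <= cp <= last for first, last in ranges)


# Illustrative excerpt in the UnicodeData.txt format (not the full file;
# the end of the CJK block differs between Unicode versions).
SAMPLE = [
    "002B;PLUS SIGN;Sm;0;ES;;;;;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;",
    "4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;",
    "9FFF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;",
]

if __name__ == "__main__":
    ranges = parse_allowed(SAMPLE)
    print([(hex(a), hex(b)) for a, b in ranges])
    print(is_identifier_char(ord("A"), ranges))
    print(is_identifier_char(ord("+"), ranges))
```

With this approach, updating to a new Unicode version only means 
feeding in the new data file; the policy (here the category set) stays 
the single thing that has to be defined by hand.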


More information about the Digitalmars-d-learn mailing list