Why is BOM required to use unicode in tokens?

Steven Schveighoffer schveiguy at gmail.com
Wed Sep 16 00:22:15 UTC 2020


On 9/15/20 8:10 PM, James Blachly wrote:
> On 9/15/20 10:59 AM, Steven Schveighoffer wrote:
>>> Thanks to Paul, Jon, Dominikus and H.S. for thoughtful responses.
>>>
>>> What will it take (i.e. order of difficulty) to get this fixed -- 
>>> will merely a bug report (and PR, not sure if I can tackle or not) do 
>>> it, or will this require more in-depth discussion with compiler 
>>> maintainers?
>>
>> I'm thinking your issue will not be fixed (just like we don't allow 
>> $abc to be an identifier). But the spec can be fixed to refer to the 
>> correct standards.
>>
> 
> Steve: It sounds as if the spec is correct but the glyph (codepoint?) 
> range is outdated. If this is the case, it would be a worthwhile update. 
> Do you really think it would be rejected out of hand?
> 

I don't really know the answer, as I'm not a Unicode expert.

Someone should verify whether the character you want to use in a symbol 
name is actually considered a letter. Using Phobos to prove this is kind 
of self-defeating, as I'm pretty sure it would be in league with DMD if 
there is a bug.
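
One way to check without going through Phobos or DMD is to ask a 
different Unicode implementation. The sketch below uses Python's 
standard unicodedata module and treats "letter" as "general category 
starts with L", which is an assumption on my part: the standard the D 
spec refers to may draw the line differently. The helper names and the 
sample characters are made up for illustration; substitute the 
character you actually care about.

```python
import unicodedata


def general_category(ch: str) -> str:
    """Return the two-letter Unicode general category of one code point."""
    return unicodedata.category(ch)


def is_letter(ch: str) -> bool:
    """True if the general category is one of Lu, Ll, Lt, Lm, Lo.

    Assumption: "letter" means general category L*. An identifier
    grammar may use a different definition.
    """
    return general_category(ch).startswith("L")


if __name__ == "__main__":
    # Version of the Unicode tables bundled with this interpreter.
    print("Unicode data version:", unicodedata.unidata_version)
    # Sample characters chosen only for illustration.
    for ch in ["é", "λ", "∑", "$"]:
        print(f"U+{ord(ch):04X}", general_category(ch), is_letter(ch))
```

The answer depends on the Unicode version the interpreter ships with, 
so it is worth noting the printed version next to the result.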

But if it's not a letter, then it would take more than just updating the 
range. It would be a change in the philosophy of what constitutes an 
identifier name.

-Steve

