Why is BOM required to use unicode in tokens?
Paul Backus
snarwin at gmail.com
Tue Sep 15 02:23:31 UTC 2020
On Tuesday, 15 September 2020 at 01:49:13 UTC, James Blachly
wrote:
> I wish to write a function including ∂x and ∂y (these are
> trivial to type with appropriate keyboard shortcuts - alt+d on
> Mac), but without a Unicode byte order mark at the beginning of
> the file, the lexer rejects the tokens.
>
> Apparently it is not easy to insert such marks (AFAICT no
> common tool does this specifically), while other languages work
> fine (i.e., accept Unicode in their source) without it.
>
> Is there a downside to at least presuming UTF-8?
According to the spec [1], a BOM is optional and UTF-8 source
should be accepted without one, so this should Just Work. I'd
recommend filing a bug.
[1] https://dlang.org/spec/lex.html#source_text
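
Until the bug is sorted out, here is a rough sketch of how one
could check for a UTF-8 BOM and prepend one if it is missing. This
is only a stopgap, not something the spec says you need. The names
utf8Bom, hasUtf8Bom and withUtf8Bom are made up for this example
and are not part of Phobos.

```d
/// The three bytes that encode U+FEFF (the byte order mark) in UTF-8.
immutable ubyte[] utf8Bom = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper: true if `data` starts with the UTF-8 BOM.
bool hasUtf8Bom(const(ubyte)[] data)
{
    return data.length >= utf8Bom.length
        && data[0 .. utf8Bom.length] == utf8Bom;
}

/// Hypothetical helper: returns a copy of `data` with a UTF-8 BOM
/// in front, or an unchanged copy if a BOM is already present.
ubyte[] withUtf8Bom(const(ubyte)[] data)
{
    return hasUtf8Bom(data) ? data.dup : utf8Bom.dup ~ data.dup;
}

unittest
{
    const(ubyte)[] plain = [0x61, 0x62];                    // "ab"
    const(ubyte)[] marked = [0xEF, 0xBB, 0xBF, 0x61, 0x62]; // BOM + "ab"

    assert(!hasUtf8Bom(plain));
    assert(hasUtf8Bom(marked));
    assert(withUtf8Bom(plain) == marked);  // BOM gets added
    assert(withUtf8Bom(marked) == marked); // no second BOM
}
```

To apply this to a source file, read the bytes with std.file.read
(casting the result to const(ubyte)[]), pass them through
withUtf8Bom, and write the result back with std.file.write.
Comparing the compiler's behaviour on the file before and after is
also a handy data point to attach to the bug report.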
More information about the Digitalmars-d-learn mailing list