Why is BOM required to use unicode in tokens?
wjoe
invalid at example.com
Wed Sep 16 15:01:54 UTC 2020
On Tuesday, 15 September 2020 at 01:49:13 UTC, James Blachly
wrote:
> I wish to write a function including ∂x and ∂y (these are
> trivial to type with appropriate keyboard shortcuts - alt+d on
> Mac), but without a unicode byte order mark at the beginning of
> the file, the lexer rejects the tokens.
>
> It is not apparently easy to insert such marks (AFAICT no
> common tool does this specifically), while other languages work
> fine (i.e., accept unicode in their source) without it.
>
> Is there a downside to at least presuming UTF-8?
As you probably already know, BOM means byte order mark, so it is
only relevant for encodings whose code units span more than one
byte (UTF-16, UTF-32). A BOM for UTF-8 isn't required, and in fact
it's discouraged.
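
To check whether a given source file actually starts with a BOM, a small script can compare its first bytes against the known BOM sequences. The following is only a minimal sketch in Python (not D), and the names `BOMS` and `detect_bom` are made up for illustration:

```python
import codecs

# Known BOM byte sequences. The UTF-32 entries come first because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return the name of the encoding whose BOM starts `data`, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

Reading the first few bytes of a file in binary mode (for example `open(path, "rb").read(4)`) and passing them to `detect_bom` would tell you whether your editor wrote a BOM and which one.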
Your editor should automatically insert a BOM if appropriate when
you save your file. You probably need to select the appropriate
encoding for your file. Typically this is available in the 'Save
as...' dialog or in the settings.
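
If the editor offers no such option, the same conversion can be done outside it. This is a rough Python sketch, assuming the input bytes are already valid UTF-8 (it raises `UnicodeDecodeError` otherwise); the function name `to_utf8` and its `add_bom` parameter are hypothetical:

```python
import codecs


def to_utf8(data: bytes, add_bom: bool = False) -> bytes:
    """Re-encode UTF-8 bytes with or without a leading BOM."""
    # The "utf-8-sig" codec decodes UTF-8 and drops a leading BOM
    # if one is present, so the input may come with or without one.
    text = data.decode("utf-8-sig")
    encoded = text.encode("utf-8")
    # Prepend the three-byte UTF-8 BOM only when explicitly requested.
    return codecs.BOM_UTF8 + encoded if add_bom else encoded
```

To apply it, read the source file in binary mode, pass the bytes through `to_utf8`, and write the result back in binary mode.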