[Issue 24190] New: Identifier tokenizer is greedy steals new line characters
d-bugmail at puremagic.com
Wed Oct 18 00:03:16 UTC 2023
https://issues.dlang.org/show_bug.cgi?id=24190
Issue ID: 24190
Summary: Identifier tokenizer is greedy steals new line
characters
Product: D
Version: D2
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P1
Component: dmd
Assignee: nobody at puremagic.com
Reporter: alphaglosined at gmail.com
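Currently, the tokenizer for identifiers is quite greedy. It consumes the
non-ASCII new line characters itself and reports them as invalid identifier
characters, when it should probably stop and defer to the outer loop, which
can then handle them or report the error.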
```d
$ cat lsps.d
void main ()
{
    enum b = 8;
    mixin ("enum a1 =\u2028b; pragma (msg, a1);");
    mixin ("enum a2\u2028= b; pragma (msg, a2);");
    mixin ("enum\u2028a3 = b; pragma (msg, a3);");
}
$ dmd lsps.d
8
lsps.d-mixin-5(5): Error: char 0x2028 not allowed in identifier
lsps.d-mixin-6(6): Error: char 0x2028 not allowed in identifier
```
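Only the first mixin, where the character follows `=`, compiles and prints 8.
The other two, where it directly follows an identifier or keyword, are
rejected. That character, 0x2028 (U+2028, LINE SEPARATOR), is a valid new line
character.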
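
To illustrate the suggested behaviour, here is a minimal sketch of an identifier scanner that stops at U+2028/U+2029 instead of erroring on them. This is not dmd's actual lexer code: `isUnicodeEOL` and `scanIdentifier` are hypothetical names made up for this example, and the rules for what counts as an identifier character are simplified (for instance, a leading digit is not rejected).

```d
import std.ascii : isAlphaNum;
import std.uni : isAlpha;
import std.utf : decode;

/// Hypothetical helper: true for the non-ASCII code points that
/// are new line characters (U+2028 and U+2029).
bool isUnicodeEOL(dchar c)
{
    return c == '\u2028' || c == '\u2029';
}

/// Hypothetical sketch: returns the length, in UTF-8 code units, of the
/// identifier at the start of `src`. Simplified; not dmd's real rules.
size_t scanIdentifier(string src)
{
    size_t i = 0;
    while (i < src.length)
    {
        size_t next = i;
        const dchar c = decode(src, next); // advances `next` by one code point

        if (c >= 0x80)
        {
            // Proposed behaviour: a new line character ends the identifier
            // and is left untouched for the outer loop to deal with.
            if (isUnicodeEOL(c))
                break;
            // Any other non-alphabetic, non-ASCII character is still an error.
            if (!isAlpha(c))
                throw new Exception("char not allowed in identifier");
        }
        else if (!(c == '_' || isAlphaNum(c)))
        {
            break; // ordinary ASCII delimiter: identifier ends here
        }
        i = next;
    }
    return i;
}

unittest
{
    // The identifier ends right before U+2028 in both failing cases above.
    assert(scanIdentifier("a2\u2028= b;") == 2);
    assert(scanIdentifier("enum\u2028a3 = b;") == 4);
}
```

The unittest block can be run with `dmd -unittest -main`. With this shape, the scanner never reports 0x2028 itself; whether the outer loop then treats it as a new line or raises its own error is a separate decision.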