Third and Hopefully Last Draft: Primary Type Syntax
Quirin Schroll
qs.il.paperinik at gmail.com
Mon Sep 23 19:03:47 UTC 2024
On Sunday, 22 September 2024 at 10:58:55 UTC, Tim wrote:
> On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll
> wrote:
>> The obligatory
>> [permalink](https://github.com/Bolpat/DIPs/blob/0562d0c1708f4f8bb79e72392218154ee39b1d4f/DIPs/DIP-2NNN-QFS.md) and [latest draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md)
>
> The grammar changes look good. I found some new ambiguities,
> but the implementation seems to always prefer the old meaning,
> so it should be no problem.
In general, ambiguities are resolved by Maximum Munch (MM): if
the next token can be parsed as part of the entity that the
grammar suggests, it will be; only if it can’t is the entity
closed, or it’s an error.
> ### Attributes with optional parens
>
> ```d
> // deprecated (size_t) x1 = 1; // Syntax error
> // align (size_t) x2 = 1; // Syntax error
> // package (size_t) x3 = 1; // Syntax error
> // extern (size_t) x4 = 1; // Syntax error
> struct UDA{}
> // @UDA (size_t) x5 = 1; // Syntax error
> ```
Those all fall under Maximum Munch: A parenthesis following any
of these attributes constitutes their optional arguments.
Attributes with optional arguments are greedy.
I wasn’t even aware of `align` without argument.
The most significant one is `extern`, because that collision can
realistically occur with the new parsing. If you have a class
`C`, `extern (C)` is ambiguous, and only Maximum Munch settles it.
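To illustrate the `extern (C)` case with code that compiles today, here is a minimal sketch. The names `C` and `counter` are made up for this example; the class exists only to provoke the ambiguity.

```d
// Hypothetical names; `C` is a class only to provoke the ambiguity.
class C {}

// Maximum Munch: `(C)` is consumed as the linkage argument of `extern`,
// so this declares an `int` with C linkage, not a variable of class type `C`.
extern (C) int counter;

static assert(__traits(getLinkage, counter) == "C");
static assert(is(typeof(counter) == int));
```

Under the draft, this reading is expected to stay the same, since the attribute's optional argument is matched greedily.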
> The attributes `deprecated`, `align`, `package` and `extern` as
> well as
> UDAs can be followed with optional arguments in parens, like the
> deprecation message. These parens are now ambiguous with a
> basic type in
> parens.
>
> The implementation seems to always try to parse the parens as
> arguments
> for the attribute, so it remains backward compatible.
Yes, and it follows MM, which is generally something programmers
can rely on.
What can be done about those? For one:
```d
attribute
{
    declaration;
}
```
This always works at declaration scope, but it’s not possible at
statement scope. There, I thought one could use an empty UDA
list `@()`, but those are expressly illegal, so one has to resort
to a dummy UDA like `@("")`. Not nice, but if you insist on
expressing something at statement scope in one go, I guess we
can ask the programmer for some concessions.
> Maybe this could be confusing for the user, when a declaration
> uses a type in parens and later an attribute is added.
There’s unfortunately little that can be done about it. A better
implementation can possibly backtrack and re-interpret what used
to be an attribute’s argument as a basic type, but to be honest,
that is a lot of work.
> ### Scope guards
>
> ```d
> alias exit = Object;
> Object x1;
> void main()
> {
>     scope (exit) x1 = new Object(); // Still a scope guard
>     // scope (Object) x2 = new Object(); // Syntax error
>     // scope (int) x3 = 3; // Syntax error
>     @0 scope (exit) x4 = new Object(); // Declares variable with type exit
> }
> ```
The big issue with these is, basically, that IMO this _must_ work:
```d
scope (ref void function())* fpp = null;
```
And it doesn’t.
> The first statement is a scope guard with the current grammar.
> With the
> new grammar it could also be a variable declaration of type
> `exit` and
> storage class `scope`. The implementation still parses it as a
> scope
> guard, so it remains backward compatible.
IIRC, I ran into this and implemented a look-ahead to handle
scope guards correctly. Scope guards use magic identifiers, and
unlike `__traits` or `pragma`, `scope` also exists without
arguments.
> The next line could also be a variable declaration, but it is
> still
> parsed as a scope guard. DMD then prints an error, because
> `Object`
> is not a valid scope identifier. The line with `x3` is a syntax
> error
> for the same reason.
I just fixed that because it was fairly easy to do so. My
implementation now looks ahead to see if it’s `scope(exit)`,
`scope(success)`, or `scope(failure)`, and if it’s not, it tries
to parse it as the `scope` attribute.
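The look-ahead described here can be sketched roughly as follows. This is only an illustration over a pre-split list of token strings, not the actual parser code: the helper name `isScopeGuard` and the string-based token representation are assumptions, and so is the requirement that the closing paren follows the identifier immediately.

```d
/// Hypothetical helper for illustration; not the real parser code.
/// `tokens` is a pre-split token list starting at `scope`.
bool isScopeGuard(const string[] tokens)
{
    // Need at least: scope ( identifier )
    if (tokens.length < 4)
        return false;
    if (tokens[0] != "scope" || tokens[1] != "(" || tokens[3] != ")")
        return false;

    // Only the three magic identifiers make it a scope guard.
    switch (tokens[2])
    {
        case "exit", "success", "failure":
            return true;
        default:
            return false; // fall back to `scope` as a storage class
    }
}

unittest
{
    assert( isScopeGuard(["scope", "(", "exit", ")", "x1", "=", "null", ";"]));
    assert(!isScopeGuard(["scope", "(", "Object", ")", "x2", ";"]));
    assert(!isScopeGuard(["scope", "(", "int", ")", "x3", "=", "3", ";"]));
}
```

In this sketch, anything that is not one of the three guard forms is left to be parsed as a declaration with the `scope` attribute.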
> The last statement is parsed as a variable declaration, because
> scope
> guards can't have UDAs.
This is interesting. It’s unlikely to be a real-world problem,
though, as it requires an unlikely combination: someone naming a
type `exit`, putting parentheses around it, and using a UDA at
statement scope.
My fix from above doesn’t change that, but again, it’s really
unlikely to appear in code anyway.
> ### Function literals
>
> ```d
> auto test1 = function (float){return 0;};
> // auto test2 = function (float)(int){return 0;}; // Syntax error
> ```
>
> Function literals have an optional return type and optional
> parameters.
> The type `float` for `test1` could be a parameter or a return
> type in
> parens. The implementation always parses the parens as
> parameters,
> so it remains backward compatible.
Yes, for backwards compatibility, it must be done that way.
However, this is an MM violation and must be mentioned in the DIP.
> The second function literal has both a return type and
> parameters, but
> it results in a syntax error, because the parens are parsed as
> parameters and no other parens are expected after that.
The second one should be allowed; otherwise some things aren’t
expressible. This should work because there’s no valid reason why
it can’t:
```d
auto fp = function (ref int function()) () => null;
```
However, this currently works and must keep behavior:
```d
auto fp = function (ref int function()) => null;
static assert(is(typeof(fp) : typeof(null) function(ref int
function())));
```
The implementation will do a look-ahead to figure out whether
it’s seeing `(Parameters) FunctionLiteralBody` or
`(Type)(Parameters) FunctionLiteralBody`.
It might be noteworthy that accepting the second form is not an
MM violation: there is no other way to parse
`(Type)(Parameters) FunctionLiteralBody`.
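As a rough sketch of that look-ahead: skip the first balanced parenthesized group and check whether another opening paren follows directly. The names `LiteralShape` and `classify` and the string-token representation are hypothetical and used only for illustration. A real parser would presumably also have to confirm that a function literal body follows.

```d
enum LiteralShape { paramsOnly, typeThenParams }

/// Hypothetical look-ahead sketch; `tokens` starts at the first `(`
/// after the `function` keyword.
LiteralShape classify(const string[] tokens)
{
    int depth = 0;
    foreach (i, tok; tokens)
    {
        if (tok == "(")
            ++depth;
        else if (tok == ")")
        {
            --depth;
            if (depth <= 0)
            {
                // The first parenthesized group just closed.
                // A directly following `(` means it was a type.
                const another = depth == 0
                    && i + 1 < tokens.length
                    && tokens[i + 1] == "(";
                return another ? LiteralShape.typeThenParams
                               : LiteralShape.paramsOnly;
            }
        }
    }
    return LiteralShape.paramsOnly;
}

unittest
{
    // function (ref int function()) () => null
    assert(classify(["(", "ref", "int", "function", "(", ")", ")",
                     "(", ")", "=>", "null"]) == LiteralShape.typeThenParams);
    // function (ref int function()) => null
    assert(classify(["(", "ref", "int", "function", "(", ")", ")",
                     "=>", "null"]) == LiteralShape.paramsOnly);
}
```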
> ### Anonymous classes
>
> ```d
> void main()
> {
> auto o1 = new class (Object) {};
> }
> ```
>
> The parens could be constructor arguments or a basic type in
> `AnonBaseClassList?`. The implementation always tries to parse
> constructor arguments, which should be fine.
I’m going to look into this. It’s probably low-priority because
a base class or interface name following `new class` never
requires parens, but it should not be an error either. I’ll
probably do the same as with function literals: look ahead and
see if there’s another set of parens. If yes, it’s
`new class (Type)(Arguments) {}`. If not, it’s
`new class /*implicit Object*/(Arguments) {}` because of
backwards compatibility.
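For reference, this is how the parens after `new class` are used in current D, which is the behavior that has to be preserved. It is a minimal sketch; `Base` and `makeAnon` are made-up names.

```d
// Hypothetical base class with a one-argument constructor.
class Base
{
    int value;
    this(int value) { this.value = value; }
}

Base makeAnon()
{
    // Today, the parens directly after `new class` are constructor
    // arguments, and the base class follows without parens.
    return new class (42) Base
    {
        this(int v) { super(v); }
    };
}

unittest
{
    assert(makeAnon().value == 42);
}
```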
I’ll commit my stuff probably tomorrow. I can’t do it now,
unfortunately.