Corner cases in Primary Type Syntax and its proof-of-concept implementation
Quirin Schroll
qs.il.paperinik at gmail.com
Fri Oct 11 18:19:01 UTC 2024
Here is the [current
draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md).
As I see it, there are three corner cases because there are
keywords that have one meaning by themselves and a different
meaning when followed by something in parentheses:
- `align` and `align` `(` *AssignExpression* `)` ― Default
alignment or specify alignment
- `scope` and `scope` `(` *Identifier* `)` ― Storage
class/attribute or scope guard
- `extern` and `extern` `(` *Tokens* `)` ― Storage class or
linkage attribute
For any of these, if the former is followed by the kind of
*BasicType* syntax the DIP introduces, there is an ambiguity
because it could also be the latter:
```d
align (X) x = x0; // `align` + type `(X)` (intentionally no “or”)
scope (Y) y = y0; // `scope` + type `(Y)` or scope guard `Y`?
extern (Z) z = z0; // `extern` + type `(Z)` (intentionally no “or”)
```
My implementation handles these somewhat differently for various
reasons.
For `align`, it does nothing special. If a parenthesis follows,
it’s treated as specifying alignment. If that isn’t what the
programmer intended, but instead a type, it leads to a parse
error with certainty (as far as I can tell) because even if `X`
turns out semantically as something that’s both an integer *and*
a type – which some weird things can (see below) –, it’s treated
as the argument to `align` and the declared `x` lacks a type.
Because `align` is not a storage class (it’s only an attribute),
`align(X)` is no replacement for `auto`. The DIP doesn’t change
the existing semantic meaning of anything. Because default
alignment can be explicitly specified, there’s a local fix to the
problem if `(X)` was meant as a type. Ideally, the compiler could
recognize this kind of error and tell the programmer to use
`align((X).alignof) (X) x = x0`. Possibly, I can add
`align(auto)` as an alternative way to say `align` with no
arguments so there’s a simpler and DRY way.
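The local fix can be tried out in today's D if the type gets a name first. The following is only a sketch: the alias `X` and the function `useExplicitAlignment` are hypothetical stand-ins, and the parenthesized type syntax from the DIP is deliberately not used, so it should compile without the proof-of-concept branch.

```d
// Hypothetical stand-in for the type written as `(X)` above.
alias X = int delegate();

int useExplicitAlignment(X x0)
{
    // Spelling out the type's own alignment has the same effect as default
    // alignment, and the parentheses are unambiguously the `align` argument.
    align(X.alignof) X x = x0;
    assert(cast(size_t) &x % X.alignof == 0);
    return x();
}

unittest
{
    assert(useExplicitAlignment(() => 42) == 42);
}
```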
For `extern`, it’s similar to `align`, just that the semantic
analysis couldn’t distinguish the meaning of the linkage
`extern(Z)` from an `extern` attribute applied to a declaration
of an object of type `(Z)`. If `Z` was meant to be a type, the
declaration lacks a type, which as with `align` is an error as
linkage cannot be used instead of `auto`. The programmer either
wanted `extern(Z) auto z` or `extern Z z`. Practically speaking,
if `Z` is `C`, `D`, `System`, or `Windows`, the parser cannot
distinguish what the programmer really wanted. Semantic analysis
can if `Z` is not in scope as a type. I did not implement
something like this, though.
For `scope` my implementation recognizes scope guards
specifically and it treats `scope` `(`*Tokens*`)` as
`scope` `BasicType` unless it’s in statement scope and *Tokens*
is exactly `exit`, `success`, or `failure`.
In the monthly meeting today, the latter was seen as contentious
as it does not allow for adding any new scope guards.
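The rule my implementation applies to `scope` can be modeled roughly as follows. This is a simplified sketch, not the parser code: the tokens are plain strings, and `ScopeParse` and `classifyScope` are names made up for illustration.

```d
// Hypothetical model of the decision; these names are not taken from the parser.
enum ScopeParse { guard, basicType }

/// Classifies the tokens between the parentheses of `scope ( Tokens )`.
ScopeParse classifyScope(const string[] tokens, bool inStatementScope)
{
    if (inStatementScope && tokens.length == 1)
    {
        switch (tokens[0])
        {
            case "exit", "success", "failure":
                return ScopeParse.guard;
            default:
                break;
        }
    }
    // Everything else is parsed as `scope` followed by a parenthesized BasicType.
    return ScopeParse.basicType;
}

unittest
{
    assert(classifyScope(["exit"], true) == ScopeParse.guard);
    assert(classifyScope(["exit"], false) == ScopeParse.basicType);
    assert(classifyScope(["ref", "int", "delegate", "(", ")"], true) == ScopeParse.basicType);
}
```

Because the three guard names are hard-coded, any future scope guard would first need to be added to that list, which is the limitation that was criticized.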
D’s syntax of linkages and scope guards was designed to be flexible
to allow for either implementation-defined nice-to-read linkages
(e.g. `C++` and `Objective-C` in DMD) or the addition of
nice-to-read scope guards.
Walter convinced me that the implementation limits the
flexibility of future scope guards. I suggested that if *Tokens*
is exactly one identifier (or one single token), it could be a
scope guard (and be an error if it’s not a valid scope guard),
otherwise a `BasicType` (and be an error if it isn’t).
This is for two reasons: No one at the meeting could imagine a
scope guard that cannot be a single identifier (or keyword if
need be) and in that case, if a type clashed with a scope guard,
removing the parentheses around the token would always work as
the types that require parentheses to express with this DIP are
always longer than one token.
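The single-token rule I suggested could be modeled like this. Again, this is only a sketch under assumptions: tokens are plain strings, the names `SingleTokenParse` and `classifySingleToken` are hypothetical, and the set of known guards is passed in so that it could grow later.

```d
import std.algorithm.searching : canFind;

// Hypothetical result type for the suggested rule.
enum SingleTokenParse { guard, invalidGuard, basicType }

/// Exactly one token means "scope guard" (an error if the guard is unknown);
/// anything longer means "parenthesized BasicType".
SingleTokenParse classifySingleToken(const string[] tokens,
    const string[] knownGuards = ["exit", "success", "failure"])
{
    if (tokens.length == 1)
        return knownGuards.canFind(tokens[0])
            ? SingleTokenParse.guard
            : SingleTokenParse.invalidGuard;
    return SingleTokenParse.basicType;
}

unittest
{
    assert(classifySingleToken(["exit"]) == SingleTokenParse.guard);
    // `scope (int) x` would be rejected; the fix is to write `scope int x`.
    assert(classifySingleToken(["int"]) == SingleTokenParse.invalidGuard);
    assert(classifySingleToken(["ref", "int", "delegate", "(", ")"])
        == SingleTokenParse.basicType);
}
```

Under this rule a new single-token guard only needs to be added to the list of known guards, while multi-token sequences remain reserved for types.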
There are two possible solutions:
- Convince Walter that multi-token scope guards are never needed.
- Change the implementation and disallow `scope` `(`*Tokens*`)`
unless it forms a valid scope guard (which currently means
*Tokens* must be `exit`, `success`, or `failure`, but it could be
more in the future).
The second option is easy to implement in theory, but in practice
wouldn’t allow the following, reasonable declaration that is
specifically enabled by this DIP:
```d
int x;
void f()
{
    scope (ref int delegate()) dg = () => x;
}
```
If it were not allowed, then it becomes a *reasonable* error to
make in the context of the goal of this DIP, and DMD should point
out how to correctly declare `dg`. (This is similar to how DMD
diagnoses C array declarations: It recognizes them and tells the
programmer how to declare the array in D style, and it should do
the same with `dg`.) The only question is, *what* should DMD
suggest? Or what would you suggest to a programmer who asks
this as a forum question?
To be honest, I don’t know. That’s why I’m asking.
- "Use `scope dg`" infers a different type than the one
specified (one with attributes).
- "Remove `scope` and let the compiler infer it." If `scope`
would be inferred, nothing changes, but if it wasn’t, maybe
something changes meaning down the road.
- "Use `scope @0 (ref int delegate()) dg`." This has the problem
that adding a UDA can change behavior. An empty UDA sequence
`@()` would work, but empty UDA sequences are expressly not
allowed.
- "Use `scope typeof((ref int delegate()).init) dg`" would work,
but is ugly and hacky.
IMO, just close the door on *some* multi-token scope guards,
namely those that would otherwise be types. It is not likely to
be an issue. As with linkage, `Objective-C` is not a type and
`C++` is also never a type.
---
```d
// currently working D code:
struct Weird
{
    static int opIndex() @safe => 8;
}
void main() @safe
{
    Weird[] weirds;
    align(Weird[]) int x;
    static assert(x.alignof == 8); // passes
}
```