Corner cases in Primary Type Syntax and its proof-of-concept implementation

Fri Oct 11 18:19:01 UTC 2024

Here is the [current 
draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md).

As I see it, there are three corner cases, because some keywords 
have one meaning by themselves and a different meaning when 
followed by something in parentheses:
- `align` and `align` `(` *AssignExpression* `)` ― Default 
alignment or specify alignment
- `scope` and `scope` `(` *Identifier* `)` ― Storage 
class/attribute or scope guard
- `extern` and `extern` `(` *Tokens* `)` ― Storage class or 
linkage attribute

For any of these, if the former is followed by the kind of 
*BasicType* syntax the DIP introduces, there is an ambiguity 
because it could also be the latter:
```d
align (X) x = x0;  // `align` + type `(X)` (intentionally no “or”)
scope (Y) y = y0;  // `scope` + type `(Y)` or scope guard `Y`?
extern (Z) z = z0; // `extern` + type `(Z)` (intentionally no “or”)
```

My implementation handles these somewhat differently for various 
reasons.

For `align`, it does nothing special. If a parenthesis follows, 
it’s treated as specifying alignment. If that isn’t what the 
programmer intended, but instead a type, it leads to a parse 
error with certainty (as far as I can tell) because even if `X` 
turns out semantically to be something that’s both an integer *and* 
a type (which some weird things can be, see below), it’s treated 
as the argument to `align` and the declared `x` lacks a type. 
Because `align` is not a storage class (it’s only an attribute), 
`align(X)` is no replacement for `auto`. The DIP doesn’t change 
the existing semantic meaning of anything. Because default 
alignment can be explicitly specified, there’s a local fix to the 
problem if `(X)` was meant as a type. Ideally, the compiler could 
recognize this kind of error and tell the programmer to use 
`align((X).alignof) (X) x = x0`. Possibly, I can add 
`align(auto)` as an alternative way to say `align` with no 
arguments so there’s a simpler and DRY way.
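
To make the two existing meanings of `align` concrete, here is a small sketch in plain D that compiles today. `Packet` and its field names are made up for illustration; the DIP form appears only in a comment because it isn’t valid D yet.

```d
// Plain D that compiles today; `Packet` and its fields are hypothetical names.
struct Packet
{
    byte tag;
    align(16) int wide;          // `align` `(` AssignExpression `)`: explicit alignment
    align int plain;             // bare `align`: default alignment for `int`
    align(int.alignof) int same; // the default alignment, spelled out explicitly
}

// Each field sits at an offset that is a multiple of its alignment.
static assert(Packet.wide.offsetof % 16 == 0);
static assert(Packet.plain.offsetof % int.alignof == 0);
static assert(Packet.same.offsetof % int.alignof == 0);

// Under the DIP (not valid D today), the analogous local fix would be:
//     align((X).alignof) (X) x = x0;
```

The last field shows why the fix is local: spelling out the default alignment keeps the `align` attribute unambiguous without changing the layout.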

For `extern`, it’s similar to `align`, just that the semantic 
analysis couldn’t distinguish the meaning of the linkage 
`extern(Z)` from an `extern` attribute applied to a declaration 
of an object of type `(Z)`. If `Z` was meant to be a type, the 
declaration lacks a type, which as with `align` is an error as 
linkage cannot be used instead of `auto`. The programmer either 
wanted `extern(Z) auto z` or `extern Z z`. Practically speaking, 
if `Z` is `C`, `D`, `System`, or `Windows`, the parser cannot 
distinguish what the programmer really wanted. Semantic analysis 
can if `Z` is not in scope as a type. I did not implement 
anything like this, though.
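
For reference, this is what the linkage form looks like in D today. It’s only a sketch: `add` and `mul` are hypothetical names, and the bare `extern` storage class is mentioned in a comment only, since it declares a symbol defined elsewhere.

```d
// `add` and `mul` are hypothetical names used only for illustration.
extern (C) int add(int a, int b) { return a + b; } // linkage attribute: C linkage
int mul(int a, int b) { return a * b; }            // no attribute: default D linkage

static assert(__traits(getLinkage, add) == "C");
static assert(__traits(getLinkage, mul) == "D");

// The bare storage class has no parentheses and declares without defining:
//     extern int z;
// With the DIP, `extern (Z) z = z0;` still parses as linkage `Z` applied to a
// declaration that then lacks a type.
```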

For `scope` my implementation recognizes scope guards 
specifically and it treats `scope` `(`*Tokens*`)` as 
`scope` `BasicType` unless it’s in statement scope and *Tokens* 
is exactly `exit`, `success`, or `failure`.
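
The rule can be sketched as a small decision function. This is not the parser code from my implementation, just an illustration of the rule as stated; `ScopeParenKind`, `classifyScopeParens`, and the representation of *Tokens* as an array of strings are assumptions made for the example.

```d
// Hypothetical names; a sketch of the rule, not the actual parser code.
enum ScopeParenKind { guard, basicType }

// `tokens` are the tokens between the parentheses after `scope`.
ScopeParenKind classifyScopeParens(const string[] tokens, bool inStatementScope) pure @safe
{
    if (inStatementScope && tokens.length == 1)
    {
        switch (tokens[0])
        {
            case "exit", "success", "failure":
                return ScopeParenKind.guard;
            default:
                break;
        }
    }
    // Everything else is `scope` followed by a parenthesized type.
    return ScopeParenKind.basicType;
}

static assert(classifyScopeParens(["exit"], true) == ScopeParenKind.guard);
static assert(classifyScopeParens(["exit"], false) == ScopeParenKind.basicType);
static assert(classifyScopeParens(["Y"], true) == ScopeParenKind.basicType);
```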

In the monthly meeting today, the latter was seen as contentious 
as it does not allow for adding any new scope guards.

D’s syntax of linkages and scope guards was designed to be flexible 
to allow for either implementation-defined nice-to-read linkages 
(e.g. `C++` and `Objective-C` in DMD) or the addition of 
nice-to-read scope guards.

Walter convinced me that the implementation limits the 
flexibility of future scope guards. I suggested that if *Tokens* 
is exactly one identifier (or one single token), it could be a 
scope guard (and be an error if it’s not a valid scope guard), 
otherwise a `BasicType` (and be an error if it isn’t).

This is for two reasons: No one in the meeting could imagine a 
scope guard that cannot be a single identifier (or keyword if 
need be) and in that case, if a type clashed with a scope guard, 
removing the parentheses around the token would always work as 
the types that require parentheses to express with this DIP are 
always longer than one token.
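
The suggested rule, again as a sketch under assumptions: `ProposedKind` and `classifyProposed` are made-up names, *Tokens* is modeled as an array of strings, and the check for statement scope is left out to keep it short. The set of valid guards is the current one and could grow.

```d
// Hypothetical names; a sketch of the suggested rule, not an implementation.
enum ProposedKind { scopeGuard, invalidScopeGuard, basicType }

ProposedKind classifyProposed(const string[] tokens) pure @safe
{
    if (tokens.length == 1)
    {
        // A single token is always read as a scope guard.
        switch (tokens[0])
        {
            case "exit", "success", "failure":
                return ProposedKind.scopeGuard;
            default:
                // e.g. `scope (Y)`: an error; the fix is to write `scope Y`.
                return ProposedKind.invalidScopeGuard;
        }
    }
    // Anything longer must be a type (and is an error later if it isn't one).
    return ProposedKind.basicType;
}

static assert(classifyProposed(["exit"]) == ProposedKind.scopeGuard);
static assert(classifyProposed(["Y"]) == ProposedKind.invalidScopeGuard);
static assert(classifyProposed(["ref", "int", "delegate", "(", ")"]) == ProposedKind.basicType);
```

The difference to the current behavior is the middle case: a single unknown token is an error instead of a type, which leaves room for new single-token scope guards.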

There are these solutions:
- Convince Walter that multi-token scope guards are never needed.
- Change the implementation and disallow `scope` `(`*Tokens*`)` 
unless it forms a valid scope guard (which currently means 
*Tokens* must be `exit`, `success`, or `failure`, but it could be 
more in the future).

The second option is easy to implement in theory, but in practice 
wouldn’t allow the following, reasonable declaration that is 
specifically enabled by this DIP:
```d
int x;
void f()
{
     scope (ref int delegate()) dg = () => x;
}
```
If it were not allowed, then it becomes a *reasonable* error to 
make in the context of the goal of this DIP, and DMD should point 
out how to correctly declare `dg`. (This is similar to how DMD 
diagnoses C array declarations: It recognizes them and tells the 
programmer how to declare the array in D style, and it should do 
the same with `dg`.) The only question is, *what* should DMD 
suggest? Or what would you suggest to a programmer who asks 
this as a forum question?

To be honest, I don’t know. That’s why I’m asking.

- "Use `scope dg`." This infers a different type than the one 
specified (one with attributes).
- "Remove `scope` and let the compiler infer it." If `scope` 
is inferred, nothing changes, but if it isn’t, something may 
change meaning down the road.
- "Use `scope @0 (ref int delegate()) dg`." This has the problem 
that adding a UDA can change behavior. An empty UDA sequence 
`@()` would work, but empty UDA sequences are expressly not 
allowed.
- "Use `scope typeof((ref int delegate()).init) dg`" would work, 
but is ugly and hacky.

IMO, just close the door on *some* multi-token scope guards, 
namely those that would otherwise be types. It is not likely to 
be an issue. As with linkage, `Objective-C` is not a type and 
`C++` is also never a type.

---

```d
// currently working D code:
struct Weird
{
     static int opIndex() @safe => 8;
}

void main() @safe
{
     Weird[] weirds;
     align(Weird[]) int x;
     static assert(x.alignof == 8); // passes
}
```