D Language Foundation October 2024 Monthly Meeting Summary
Mike Parker
aldacron at gmail.com
Fri Mar 7 11:23:17 UTC 2025
The D Language Foundation's monthly meeting for October 2024 took
place on Friday the 11th. It lasted about an hour and twenty
minutes, though there was a long discussion at the end. I was
unable to attend, so Razvan ran the meeting and Dennis recorded
it for me. Quirin Schroll attended to discuss his Primary Type
Syntax DIP.
## The Attendees
The following people attended:
* Walter Bright
* Rikki Cattermole
* Jonathan M. Davis
* Timon Gehr
* Martin Kinkelin
* Dennis Korpel
* Mathias Lang
* Átila Neves
* Razvan Nitu
* Quirin Schroll
* Adam Wilson
## The Summary
### Primary Type Syntax DIP
Quirin had previously joined [our August monthly
meeting](https://forum.dlang.org/post/ldhsnvniyehrrfqhxjde@forum.dlang.org) to discuss his Primary Type Syntax DIP. At the time, it was in [its second round of feedback](https://forum.dlang.org/thread/zymqcnpjcpuphpeulhev@forum.dlang.org) in the DIP Development forum and he had a working implementation, but he wasn't sure if it was ready to move forward with Formal Assessment.
In that meeting, Walter had raised concerns about potential
grammar ambiguities related to the proposed use of parentheses.
He didn’t want D to suffer from a problem that C++ had, where in
some contexts something could be either an expression or a type.
At the time of this meeting, [the DIP was on its fourth
draft](https://forum.dlang.org/post/cekqyahwnumvesppxsfs@forum.dlang.org).
As part of the forum feedback, someone had tried their best to
find parsing issues and uncovered a few issues in the
implementation. For the most part, the fix was to do nothing. The
issues didn't cause anything weird to happen. They resulted in
parse errors, which meant the programmer would have to alter
their code to express what they wanted differently.
One example was `scope`. We have both the single attribute and
the scope guard, which is the keyword plus parentheses. He
mentioned `align` and `extern` as other examples. In each of
these cases, you could just rearrange the keywords to resolve the
error.
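For readers less familiar with the two forms, here is a minimal
sketch in current D showing a keyword used bare and the same kind
of keyword followed by parentheses. It is not taken from the DIP;
the names `twice` and `Padded` are made up for illustration.

```d
import std.stdio : writeln;

// Keyword plus parentheses: a linkage attribute.
extern(C) int twice(int v) { return 2 * v; }

// Keyword plus parentheses: an alignment attribute on the struct.
align(16) struct Padded { int x; }
static assert(Padded.alignof == 16);

void main()
{
    // Bare keyword: `scope` as a storage class on a declaration.
    scope int* p = null;

    // Keyword plus parentheses: a scope guard statement.
    scope(exit) writeln("leaving main");

    writeln(twice(21), " ", p is null);
}
```

With the DIP, a parenthesized type could directly follow any of
these keywords, which is where the two forms start to collide.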
The sort of weirdness that happened in C++, where you could put
parentheses around the identifier that was being declared,
couldn't happen here. It would never parse in D.
Walter said there was an ambiguity in the C grammar where
`(identifier)` could initiate two completely different parses.
Quirin said that was impossible in D because when the parser
tried to match a declaration, it preferred to parse it as a type.
If you wanted the parser to treat it as an identifier, it
couldn't be in parentheses. Whatever was in parentheses could
never be the declared object.
Walter said that one ambiguity required a symbol table to resolve
in C and C++. He avoided it in D by requiring the `cast` keyword
to disambiguate it.
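A small sketch of the point about `cast`, written for this
summary rather than quoted from the meeting. The alias `T` and
the function `negateAsT` are hypothetical names.

```d
alias T = long;

// In C, `(T) - x` is a cast when T names a type and a subtraction
// when T names a variable, so the parser needs the symbol table.
// In D, the `cast` keyword settles it from the tokens alone.
T negateAsT(int x)
{
    return cast(T) -x;
}

void main()
{
    static assert(is(typeof(negateAsT(3)) == long));
    assert(negateAsT(3) == -3);
}
```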
Quirin said his DIP didn't touch any of that. It did touch `cast`
indirectly in that it affected types, and you could have a type
in a cast, but it was completely inside the parentheses of the
cast.
He said `cast` was actually pretty nice because you could have
not only a basic type but a general type inside the parentheses.
There was no problem with that. The only parsing issues were with
attributes that had both a standalone form and a form with
parentheses.
For scope guards, he implemented a look ahead that recognized
`success`, `failure`, and `exit`. If the parser saw those in the
parentheses, then it knew it wasn't dealing with a type, but a
scope guard. If instead, it saw anything else in the parentheses,
it parsed it as a type. It couldn't be a scope guard in that case.
Walter asked what would happen if there were a type named `exit`.
Quirin said the parser would prefer to treat `scope(exit)` as a
scope guard in that case.
He said that in the current implementation, when the opening
parenthesis was found after `scope`, it was then treated as a
scope guard. Any unexpected identifier inside the parentheses
resulted in a parse error. His implementation instead did a look
ahead before deciding it was a scope guard because with this DIP
it could instead be a type. So it was more lenient.
He added that if you had a type called `exit` in current D,
`scope(exit)` still wasn't a declaration. In his implementation, it
looked like a declaration, but because `exit` was one of the three
possible identifiers in a scope guard, it remained a scope guard.
Walter said it still sounded ambiguous to him.
Quirin agreed it was ambiguous, but reiterated that his
implementation resolved the ambiguity by treating it as a scope
guard if it found `success`, `failure`, or `exit` in parentheses
following `scope`. If it wasn't one of those, then hopefully it
was a type. The semantic analysis would find out.
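The rule Quirin described can be modeled roughly as follows. This
is a toy sketch over a list of token strings, not Quirin's
implementation and not how the dmd parser is structured; the enum
`AfterScope` and the function `classifyAfterScope` are invented
for illustration.

```d
enum AfterScope { attribute, scopeGuard, maybeType }

/// `tokens` holds the tokens that follow the `scope` keyword.
AfterScope classifyAfterScope(string[] tokens)
{
    // No opening parenthesis: the plain `scope` attribute.
    if (tokens.length == 0 || tokens[0] != "(")
        return AfterScope.attribute;

    // Look ahead for exactly `( name )` with one of the guard names.
    if (tokens.length >= 3 && tokens[2] == ")")
    {
        switch (tokens[1])
        {
            case "exit", "success", "failure":
                return AfterScope.scopeGuard;
            default:
                break;
        }
    }

    // Anything else is parsed as a type; semantic analysis decides
    // later whether it really is one.
    return AfterScope.maybeType;
}

void main()
{
    assert(classifyAfterScope(["(", "exit", ")", "foo", ";"])
        == AfterScope.scopeGuard);
    assert(classifyAfterScope(
        ["(", "ref", "int", "function", "(", ")", ")", "fp"])
        == AfterScope.maybeType);
    assert(classifyAfterScope(["int", "*", "p"])
        == AfterScope.attribute);
}
```

Note that in this model a type named `exit` can never be reached
through `scope (exit)`, which matches the preference Quirin
described.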
Walter suggested another way to deal with it would be to look
beyond the closing parenthesis to see what comes after. Quirin
agreed the implementation could be smarter. Walter said it could
see if the rest of it parsed as a declaration or an expression.
Quirin said that was much more work and more expensive. Walter
agreed.
Quirin said his implementation wasn't trying to be perfect. It
was like a proof of concept. Walter said he wasn't faulting him
for it. He was just trying to think of ways to resolve the
ambiguities.
Timon said you couldn't resolve the ambiguities just by looking
further ahead. In the chat, he gave the example of `scope(exit)
foo`, which could be a paren-free call to `foo`. You could always
parse it as a declaration and do something else if the semantic
analysis figured out that it wasn't.
Quirin asked if there was something that could parse as a
declaration and then turn out not to be one. Timon said there
were so many kinds of declarations that the answer was probably
"yes". The `foo` example, without Quirin's disambiguation, could
parse as a declaration but was actually a scope guard.
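Timon's example can be reproduced in current D. The sketch below
is written for this summary; the counter `calls` and the function
`run` are hypothetical helpers used only to make the effect
visible.

```d
import std.stdio : writeln;

int calls;

void foo()
{
    ++calls;
    writeln("foo ran at scope exit");
}

void run()
{
    // Current D: a scope guard whose statement is a paren-free
    // call to `foo`. With parenthesized types allowed, the same
    // tokens also have the shape `scope (Type) name;`, that is,
    // a `scope` variable `foo` of a type named `exit`.
    scope(exit) foo;
    writeln("body of run");
}

void main()
{
    run();
    assert(calls == 1);
}
```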
Quirin said choosing to look ahead only for the scope guard
identifiers was the right thing to do. If instead, the parser
checked to see if `exit` was a valid type, then that would change
the meaning of existing D code. And that was not okay.
If you wanted your type named `exit` in parentheses, you could
write something between it and `scope` and it would be okay. It
was that easy.
Walter said these ambiguous cases should be clearly identified in
the DIP. Quirin said he didn't put it in the DIP because the DIP
specified maximal munch. He thought it wasn't noteworthy whenever
maximal munch did a great job at disambiguation because the
default behavior did what was intended. Some maximal munch
exceptions were described in detail in the DIP.
Walter didn't think maximal munch solved this. The implementation
was just deciding it was a scope guard when it saw `exit` in the
parentheses.
Rikki noted that when trying to solve something like this, it
also still had to work with a parser for a text editor or an IDE.
If they couldn't do it, then it wasn't a good solution.
Basically, just limit the cleverness.
Quirin said he had included something about syntax highlighting
in one of the drafts, but couldn't remember if it was still
there. He talked a bit about how some requirements for
highlighters were difficult in D without semantic analysis, and
some examples involving simple vs. complex highlighters. In
short, in answer to Rikki's question, he said simple syntax
highlighters should have no problem. They would recognize the
`scope` keyword and the parentheses, and then do what they did.
He then mentioned `extern`. For its use as a linkage attribute,
e.g., `extern(C)`, the specification only required `C`.
Everything else, including `Windows`, was implementation-defined.
Walter said the idea was that any identifier could appear within
the parentheses. Quirin said that arbitrary tokens could also be
between them.
There was some discussion then about specific details of what the
parser currently accepted and what it rejected with regard to
`extern`. Then Martin said he would like to see a requirement
that when specifying a linkage attribute, an opening parenthesis
must be required immediately following `extern`, with nothing in
between, to more clearly distinguish it from the `extern` storage
class.
Quirin mentioned an earlier DIP draft had included text that made
whitespace between a type constructor and the opening parenthesis
significant. He ended up removing it because it was too hard for
simple syntax highlighters, as they didn't like semantic
whitespace. He then went into some detail about it to make the
point that potential problems that existed with `const`,
whitespace, parentheses, and types didn't really exist with
`extern`.
He said the issue with parentheses starting a basic type was not
specific to the Primary Types DIP. Any DIP proposing tuples had
the same issue. Timon disagreed because tuples were signified by
commas, so his tuples proposal didn't have that problem. He said
it could be solved in the way Quirin was describing, but his
proposal didn't do that.
Walter said that `scope` and `extern` could be specified such
that they couldn't be followed by a type. Dennis thought that was
best. Using `exit`, `failure`, and `success` to disambiguate
would mean we couldn't add something new without breaking
anything that used the new thing as a type name.
Walter agreed and said the syntax of `scope` and `extern` was
specifically designed to allow anything in the parentheses so
that we could extend them in the future. He suggested the
implementation be changed so that when it sees an opening
parenthesis after `extern` or `scope`, then it should never treat
anything in the parentheses as a type. That would then be
forward-compatible. He thought it a reasonable solution.
Quirin said that `scope` and `extern` were different in that
regard. A scope guard was always four tokens long, with three of the
tokens nailed down, and the remaining token was always one of
three possible identifiers: `exit`, `success`, and `failure`. For
all intents and purposes, a scope guard was effectively a single
token. It was easy to recognize and distinguish from something
that wasn't a scope guard. It would only ever have an identifier
in the parentheses.
Walter repeated that the syntax was intended to allow anything in
the parentheses to enable future extensions. Quirin gave the
following example:
```d
scope (ref int function()) fp = null;
```
He asked how he would then distinguish this from a scope guard.
Walter repeated that it was all about allowing for future
extensions. Quirin said he couldn't imagine wanting anything in a
scope guard other than an identifier.
Walter said he couldn't think of an example at the moment, but
that wasn't the point. He didn't think it was a terrible
limitation to say that a scope guard that didn't have a type in
the parentheses was its own separate entity. He couldn't see how
that harmed Quirin's proposal or would compromise it.
Quirin said it probably wouldn't.
Timon said the following did not work in his unpacking branch:
```d
extern (a,b) = tuple(1,2);
```
He had no problem with `scope`, only `extern`, because it just
ate everything.
Quirin said the question to answer was what the programmer could
do when they wanted an `extern` or `scope` variable instead of a
linkage attribute or a scope guard. He thought the typical
solution with `scope` would just be to rely on inference. You
could alias the type, or you could put something in between the
`scope` and the opening parenthesis. Even a comment or a UDA
would work. The `scope` would then be unambiguous because it was
no longer followed by an opening parenthesis. But it would only
work in declaration scope, not statement scope.
He asked if you could have meaningful `extern` variables in
statement scope. Martin said you definitely couldn't initialize
those. They were for forward declarations. Quirin said he didn't
know how to resolve it for local variables. Martin said he had no
idea if it was feasible to declare a global `extern` variable in
function scope.
Jonathan said that made no sense semantically. `extern` should be
at the global level. Quirin said it was allowed. He had tried it.
Maybe we should disallow it. Jonathan said at the very least the
parser should ignore it. It made no sense.
Martin agreed that made no sense. He said he wouldn't like to see
a special case for `scope` here. Quirin's current approach to
disambiguate the scope guard was simple. It should be fine. If we
did need to extend scope guards in the future, we could then
change the implementation to disambiguate differently as needed.
Walter preferred the simpler way: just have `scope(` mean it was
a scope guard.
Quirin came back to his previous example:
```d
scope (ref int function()) fp = null;
```
He said a good compiler would decide this wasn't a scope guard.
It wasn't explicitly allowed, but it could be parsed. The
programmer could then get an error, where the compiler was
saying, "Hey, I know what this is, I know what you want, but
because we want flexibility and extendability with scope guards,
I need you to explicitly do *this* to let me know you really
didn't want to write a scope guard."
He asked Walter what the *this* in the message should be. What
should the programmer put between the `scope` and the opening
parenthesis?
Walter said the solution was to use an alias:
```d
alias X = ref int function();
scope X fp = null;
```
Átila agreed. Jonathan said that as ugly as it was, there were
already cases where we were forced to do that. Quirin said the
one solution he had found was to add a UDA between `scope` and
the type:
```d
scope @0 (ref int function()) fp = null;
```
The UDA could have any meaning, so it wasn't 100% equivalent.
Átila noted that the opening of the DIP said the goal was that
"every type expressible by D’s type system also has a
representation as a sequence of D tokens". But he didn't see
anything about function types anywhere in the DIP.
Quirin said he hadn't put much thought into function types, as
there weren't many places where you could actually use them.
Walter said he thought the whole point of the DIP was function
types. Quirin clarified that it was function pointer types and
delegate types that return a reference or have a linkage that
isn't `extern(D)`.
He said the whole point was that you could write `(ref int
function(args) @someAttributes)`, and this was the type. If you
had something like this as parameters or return types,
it should be a fully formed type. But if you saw it in an error
message, you couldn't copy-paste it into the code because it
wouldn't parse. He said that function types, on the other hand,
were a weird artifact of how they were implemented in the
compiler. That was how he saw them, anyway.
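To make the motivation concrete, here is a minimal sketch in
current D of a function pointer type that returns by reference.
Today the type can be named through an alias; the DIP's aim, as
described above, is to let it be written inline. The names
`counter`, `getCounter`, and `RefGetter` are made up for this
example.

```d
int counter = 1;

ref int getCounter() { return counter; }

// Current D accepts the leading `ref` here because this is an
// alias declaration.
alias RefGetter = ref int function();

void main()
{
    RefGetter fp = &getCounter;
    fp() = 42;   // assigns through the returned reference
    assert(counter == 42);
}
```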
Átila said they could probably be inferred in templates as well.
The reason he was talking about function types was because they
were in C++, but hardly anyone used them except in templates. He
and Quirin then had a bit of a discussion about how function
types were used in C++.
Quirin said he could try to extend the DIP to include function
types. He didn't know how hard it would be. He talked about some
of the difficulties he'd had with linkage in the current draft.
He and Átila talked a bit about what the implementation might
look like.
Walter said the DIP needed a full enumeration of all the
ambiguous cases and how they were resolved. Other people wanting
to implement the language would need a precise guide for how the
disambiguation was handled.
Quirin said he could do that. We'd actually want a simpler, less
lenient disambiguation protocol if we wanted to stay more
flexible for the future. A smart implementation that could
diagnose errors would be more complicated.
Walter said a simpler implementation was definitely something to
consider for people who were going to write their own formatters
and things that required parsing. Simplicity of disambiguation
was important.
Razvan suggested Quirin wrap this discussion up in the interest
of time. He asked if Quirin had reached any conclusions about the
next steps.
Quirin said that Walter had convinced him on the scope guard
stuff, but he still wanted the implementation to recognize what
the user intended and suggest a way to rewrite ambiguous code.
### Copy constructor generation
Razvan said that Walter had [proposed in a forum
thread](https://forum.dlang.org/post/vdt27v$279p$1@digitalmars.com) a kind of extra syntax for move constructors to distinguish them from normal constructors. Though it seemed on the surface that the new syntax shouldn't affect copy constructors, it actually did. We should want the constructor syntax to be consistent. It would be weird if the copy constructor looked like a normal constructor, but the move constructor had some kind of extra syntax requiring a UDA or additional tokens.
The forum discussion had gotten sidetracked a bit to talk about a
couple of limitations of copy constructors. One was that if you
had a templated constructor that was supposed to be a copy
constructor, there was no way for the compiler to know that
without instantiating it. The second was that if you had a struct
A that had a field of struct B, and A had no copy constructor
defined but B did, the compiler would then define an `inout` copy
constructor that was basically useless and wouldn't be callable.
When Razvan had written the copy constructor DIP, his initial
approach to copy constructor generation was to look at all of the
fields to see what copy constructors they define and do an
intersection---define all those copy constructors and if they
type check, then generate them; if they don't type check, just
disable them.
At the time, Walter had been against that approach as being too
complex. It was easier to generate a single copy constructor.
Razvan said people had been reporting issues with that
approach, and seeing that `inout` copy constructors were useless,
he had decided to go ahead and implement his original approach
and [submitted a PR for
it](https://github.com/dlang/dmd/pull/16429).
Razvan thought that approach matched what people expected. He
knew that Walter still didn't like it, but he suspected most
people in the meeting would prefer it. He wasn't sure though, so
that was why he wanted to bring it up.
Walter said he'd posted some new comments on the (at the time,
Bugzilla) issue about [inout copy constructor
generation](https://github.com/dlang/dmd/issues/19710) the night
before the meeting.
He then emphasized that this issue was completely orthogonal to
move constructors and he didn't understand why it kept coming up
in that forum thread. It should be in a separate thread.
Jonathan said that move constructors would have the same issue.
Walter agreed, and because it affected both copy and move
constructors, it was a separate issue from either of them.
Timon noted that a move constructor modified the object it was
coming from, so there was a question of what to do with the
qualifiers there. Walter said that a copy constructor with a
non-const `from` argument could modify it.
Timon agreed but said we were now talking about qualified
arguments. It was a similar problem with move constructors
because it also applied to `shared`. He thought it was a separate
discussion in terms of what should be done there.
Walter agreed, then went back to the reported issue. He said he'd
boiled the example down to something much simpler that
made the problem much clearer. He agreed that the generation of
an `inout` constructor was just wrong. But if you went through
the fields, found that some of them were mutable, and generated a
copy constructor with a mutable rvalue, he didn't see why
multiple copy constructors needed to be generated.
Another thing he didn't like was what happened with `shared`.
That was a weird beast, because when you did things with
`shared`, you didn't do things as you did with normal code.
Trying to make a shared struct work like a normal struct seemed
like something that wasn't going to work anyway, so why worry
about the `shared` thing? All you were really dealing with was
`const` and `immutable`. The thing to do, then, was to look
through the fields for `const` and `immutable`, or for mutable
initializations, and generate a single copy constructor with a
`const`, `immutable`, or mutable argument as required. He didn't
see a reason to have a combinatorial collection of constructors.
Martin said that was what he would expect, too: check through all
the fields and if any required a mutable reference, then the
parent aggregate's copy constructor would require a mutable
reference, too. Check for `immutable`, then `const`, then
mutable, then we should be finished.
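As an illustration of the single-constructor idea, here is what
such a constructor looks like when written by hand in current D.
This is only a sketch under stated assumptions: the structs `A`
and `B` are hypothetical, and the code says nothing about what
the compiler actually generates.

```d
struct B
{
    int x;
    // Mutable source, mutable destination.
    this(ref return scope B rhs) { x = rhs.x; }
}

struct A
{
    B b;
    // Hand-written stand-in for one generated copy constructor.
    // B's copy constructor needs a mutable source, so this one
    // takes a mutable source as well.
    this(ref return scope A rhs) { b = rhs.b; }
}

void main()
{
    A a1;
    a1.b.x = 5;
    A a2 = a1;   // goes through A's copy constructor
    assert(a2.b.x == 5);
}
```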
Walter said if copy constructors for the fields were `inout`,
then we'd need to generate an `inout` copy constructor in that
case. He didn't see any issues with it.
Razvan said the issue was that he didn't know how people were
defining their copy constructors. If you had fields with
different kinds of copy constructors, he wasn't sure that
generating just one copy constructor was sufficient to cover all
the cases. And it was technically possible to have a copy
constructor that was `shared`, so what would you do when a field
had that?
Jonathan said that as soon as you had multiple members that were
`shared`, there was no way a generated `shared` copy constructor
could be valid in terms of what it needed to do. He thought it
was no problem, in that case, to just say, sorry, but we're not
going to generate anything for you because we're not sure we can
do the semantically correct thing with `shared`. Even if your
member variables each handled `shared` correctly individually,
the semantics changed once you were dealing with the type as a
whole. We couldn't just do that automatically. So it was fine to
say we're not going to do this for you, you have to write one
yourself.
Walter agreed. If you were going to mess around with `shared`,
you should be explicit and write your own copy constructor to do
what you wanted it to do.
He said the problem with having multiple copy constructors was
that you really wouldn't. The compiler would just end up picking
one, and that's what it would always pick. So there was only one
copy constructor you had to worry about.
Razvan wanted clarification that when we had multiple fields and
only one was `shared`, then a copy constructor shouldn't be
generated. Jonathan confirmed that, saying there was no correct
way to define it.
Walter said you couldn’t automatically know what the user
intended since dealing with `shared` would likely require calling
atomic operations and such. The whole concept of copy
construction with a `shared` type seemed problematic to him.
Jonathan added you'd need locks and stuff internally to deal with
it properly, but you had to make sure you'd written your code
specifically to deal with it properly. The compiler would have no
idea how to do that.
Razvan said these rules sounded simple, but thinking out loud,
you could have all kinds of weird cases. If you had a field that
had a mutable to mutable copy constructor, but then you also had
one with a `const` to mutable copy constructor...
Walter said that when you mixed `const` and mutable, then you got
`const`. And if the fields were mutable, then you generated a
mutable copy constructor. If anyone could come up with a case
where this was wrong and he'd overlooked something, he invited
them to let him know, but he thought this approach would work.
He added that although a mutable copy constructor was legitimate,
he thought a lot of people wrote them by default when they should
really be writing a `const` one by default.
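A short sketch of why a `const` source is the more flexible
default: a `ref const` parameter accepts mutable, `const`, and
`immutable` lvalues alike. The struct `Point` is a hypothetical
example, not something discussed in the meeting.

```d
struct Point
{
    int x = 7;
    // `const` source: callable with any of the three qualifiers.
    this(ref return scope const Point rhs) { x = rhs.x; }
}

void main()
{
    Point m;
    m.x = 1;
    const Point c;       // default-initialized, x == 7
    immutable Point i;   // default-initialized, x == 7

    Point fromMutable = m;
    Point fromConst = c;
    Point fromImmutable = i;

    assert(fromMutable.x == 1);
    assert(fromConst.x == 7);
    assert(fromImmutable.x == 7);
}
```

A copy constructor declared with a mutable source would accept
only the first of these three copies.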
Jonathan said for a case he had, he would like to be able to say
you can't copy `immutable`. Walter said there were legitimate
reasons for that, but they were unusual.
This led to a digression about copying `immutable` fields and the
transitiveness of `const`. Then Walter asked Razvan to revise his
PR to try the other scheme they had discussed and see if there
was something they hadn't thought of. If it turned out to be
wrong, they could revisit it.
## Conclusion
With the agenda items covered and having gone about an hour and
20 minutes, Razvan asked if anyone had anything else to discuss.
Martin said he hadn't followed the forum thread on move
constructors and it was too big to dig into now. This launched a
surprisingly long discussion that got deep into the weeds of move
constructors, the C++ implementation, Weka's use case for the old
`opPostMove` DIP, and more. It's a difficult conversation to
follow, and I don't see that anything actionable came out of it,
so I'll skip the summary.
Our next monthly meeting took place on November 8th.
If you have something you'd like to discuss with us in one of our
monthly meetings, feel free to reach out and let me know.