D Language Foundation August 2024 Monthly Meeting Summary

Mike Parker aldacron at gmail.com
Wed Dec 18 04:34:25 UTC 2024


The D Language Foundation's monthly meeting for August 2024 took 
place on Friday the 9th. It lasted about an hour and forty 
minutes.

## The Attendees

The following people attended:

* Walter Bright
* Iain Buclaw
* Rikki Cattermole
* Jonathan M. Davis
* Timon Gehr
* Martin Kinkelin
* Dennis Korpel
* Mathias Lang
* Razvan Nitu
* Mike Parker
* Robert Schadek
* Quirin Schroll
* Adam Wilson

## The Summary

### Replacing D's escape analysis

Rikki said he'd spoken with Dennis a month ago about trying to 
simplify D's escape analysis, but nothing had come of it. At 
BeerConf, he'd brought it up again and Dennis said he'd been 
thinking about it. Rikki had also spoken to Walter about it, and 
Walter had said that DIP 1000 wasn't quite doing what we wanted 
it to do and was a bit too complex.

As such, Rikki wanted to discuss the possibility of replacing DIP 
1000 as D's escape analysis solution. He thought the first step 
before making any solid decisions was to make sure it was fully 
under a preview switch. Dennis confirmed that it currently was.

Rikki said the next step was to think about replacing it and what 
that might look like. He asked for suggestions.

Dennis said that before deciding on how to replace it, we should 
first state what was wrong with the current design and the goals 
of a replacement. He said he had some issues with it. One was the 
lack of transitive scope. Another was that structs only had one 
lifetime even if they had multiple members. He'd been thinking of 
allowing struct fields to be annotated as `scope`, but he didn't 
have a concrete proposal yet.

Walter said the difficulty he'd encountered wasn't that DIP 1000 
was complicated, but that the language was complicated. You had 
to fit it into each of the language's various constructs. How did 
reference types work? Or implicit class types? Or lazy arguments? 
Constructors? That was where the complexity came from.

He gave the example of implicit `this` arguments to member 
functions. He'd explained over and over again that if anyone 
wanted to understand how they worked with DIP 1000 in 
constructors or member functions, the thing to do was to write it 
out as if `this` were an explicit argument. Then you'd be able to 
see how it was supposed to work. But people found that endlessly 
confusing.
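A minimal sketch of the rewrite Walter described (the struct and function names are illustrative): a member function's implicit `this` behaves like an explicit `ref` parameter, so its `scope` and `return` annotations can be reasoned about the same way.

```d
struct S
{
    int x;

    // With DIP 1000, the implicit `this` is treated like a `ref` parameter.
    // The `return` attribute here applies to that implicit `this`.
    int* addr() return
    {
        return &x;
    }
}

// Roughly the same function written with `this` made explicit,
// which makes the annotation's target obvious:
int* addr(return ref S self)
{
    return &self.x;
}
```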

Any proposal to simplify it would also have to justify how it 
could possibly be simpler than DIP 1000 was now, as DIP 1000 was 
complicated because the language was complicated. It had to 
support every language construct.

Rikki said there were three different levels of escape analysis. 
The most basic level was "this is an output, and this contributes 
to the outputs". Then you had what the language was able to 
infer. Then you had what the programmer could add that could be 
proven. We didn't really have that scale, so there was no 
recourse when the analysis was just too broad.

He said he would also like the `this` reference and the nested 
encapsulation context to be explicit arguments that you could 
annotate when you needed to, e.g., to declare it couldn't be 
`null`.

Walter noted that Herb Sutter had put out a proposal for his 
revamped C++ language requiring `this` to be explicit. That would 
resolve confusion regarding implicit arguments. But there was 
also the case of implicit arguments when you had a nested 
function. Those were hidden arguments and had the same issue. He 
didn't see any straightforward solution because it was a 
complicated problem.

He reiterated that the complexity of DIP 1000 was due to the 
complexity of the language and not to the concept itself, which 
was very simple. If you wrote it out using pointers, everything 
was clear and simple. It was when you added things like `auto 
ref` that it started getting complex. He'd never liked `auto ref` 
and never used it because it was just confusing.

He said if Rikki could think of a better way to do it, he was all 
for it. DIP 1000 was his best shot at it.

Timon said that DIP 1000 arguably already incurred some of the 
complexity cost of being able to annotate different levels of 
indirection, but it didn't allow you to do that in general. There 
was probably a better trade-off there.

Walter said there were two kinds of indirection: pointers and 
references. That doubled the complexity of DIP 1000 right there. 
Timon agreed but said it meant that DIP 1000 was not what you got 
when you translated everything to pointers. DIP 1000 was a step 
up from that because it actually had two levels of indirection 
per pointer, but it was restricted in a way that wasn't 
particularly orthogonal. How you annotated either level of 
indirection depended on the construct.

Walter agreed and said that was because references had an 
implicit indirection and pointers did not. He asked what could be 
done about that.

Rikki asked everyone to let him know if they had any ideas.

Quirin said that he understood the aim of DIP 1000 to be that you 
could take the address of a local variable, like a static array 
or something, and the pointer would be scoped and unable to 
escape. So it might be the case that in a future version of the 
language where DIP 1000 was the default, there could be a 
compiler switch to disable it so that taking the address of a 
local variable would then be an error.

He said the issue he'd run into was that if you had a system 
function but didn't actually annotate it as `@system`, and then 
you had a `scope` annotation, the compiler would assume that you 
were doing the scope thing correctly. But if you weren't, you 
were screwed. It was very easy to do accidentally.

This was an issue with DIP 1000. You could shoot yourself in the 
foot in system code. Not in safe code if you were doing it 
correctly, but if you were a beginner and didn't annotate 
something `@safe` and then used, for example, `-preview=in`, 
which was implicitly scope, you could get into trouble.
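A hedged sketch of the footgun Quirin described (the names are illustrative): in unannotated, effectively `@system` code, the compiler trusts a `scope` annotation rather than checking it.

```d
int* leaked;

// No `@safe` here, so this is effectively @system: the compiler assumes
// the `scope` contract is honored instead of verifying it.
void store(scope int* p)
{
    leaked = p; // this escape is not diagnosed in @system code
}

void caller()
{
    int local = 1;
    store(&local); // `leaked` dangles once `caller` returns
}
```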

So he thought having the option to disable that stuff but enable 
all the checks of `scope` and things like that in `@safe` code 
would be good.

Walter said `@system` turned off all the checks because sometimes 
you needed to do nasty things. And beginners shouldn't be writing 
`@system` code. Quirin said if you didn't use `@safe`, DIP 1000 
made the language more dangerous. He thought this might be why 
some people had a problem with it.

Walter thought the biggest problem was that people didn't like to 
write annotations. The only reason they were necessary was for 
the case where you didn't have a function body. If you just took 
the address of a local, the compiler would say, "Okay, that's a 
scope pointer now". It would do that automatically. You didn't 
have to do anything extra for that. The difficulty was in the two 
places where you needed to add annotations: when there was no 
function body and in a virtual function. The compiler couldn't do 
it automatically in those cases.
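A brief sketch of the cases Walter mentioned (function names are illustrative). With `-preview=dip1000`, taking a local's address makes the pointer `scope` automatically; explicit annotations are only needed where the compiler has no body to infer from:

```d
@safe int* example()
{
    int x = 42;
    auto p = &x;  // inferred `scope` under -preview=dip1000; nothing to write
    // return p;  // would be rejected: a scope pointer must not escape
    return null;
}

// The two places where explicit annotation is required:
@safe void noBody(scope int* p);   // no function body to infer from

interface I
{
    @safe void virt(scope int* p); // virtual: overrides can't drive inference
}
```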

Jonathan said part of the problem DIP 1000 was trying to solve 
was something he didn't care about. He was totally fine that 
taking the address of a local meant you had to avoid escaping it. 
It was nice to have extra checks for it, but all those 
annotations got very complicated very fast, and a lot of it was 
because of how complicated the language was.

For the most part, he wouldn't want DIP 1000 on at all except in 
very specific circumstances where he wanted some extra safety. 
Not having it was actually simpler. If you were only taking the 
addresses of locals in a small number of places, and therefore 
those functions were `@system`, then most of your code was safe 
and you were fine. But once you turned on DIP 1000, you ended up 
with `scope` inferred all over the place, and then figuring out 
what was going on became far, far more complicated.

He said it seemed like a lot of complication to try to make 
something safe, which most code shouldn't need to worry about 
anyway. If you were using the GC for everything, then you 
typically only had to take the address of things in a small 
number of places. All the complications around `scope` didn't 
really buy you anything. It just made it harder to figure out 
what was going on.

If it were a problem that needed a solution, he would love it if 
we could solve it in a simpler way. He had no clue how we might 
go about that, but if it were up to him he'd rather not have it 
at all because of the complexity that it brought.

Walter said the language had recently been changed to allow `ref` 
for local variables. That allowed for more safety without needing 
annotations. He thought it was a good thing. It improved the 
language by reducing the need for raw pointers. The next step 
would be to allow `ref` on struct fields. The semantics of that 
would have to be worked out, but the more you could improve the 
language to reduce the need for raw pointers, the more inherently 
safe it would become, and there would be fewer problems.
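The recent change Walter referred to can be sketched like this (assuming a compiler new enough to accept `ref` locals):

```d
@safe void refLocal()
{
    int x = 1;
    ref int r = x;  // an alias for x; no raw pointer involved
    r = 2;
    assert(x == 2); // writes through r update x
}
```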

Adam said someone in Discord had suggested that we not build 
Phobos v3 with DIP 1000 turned on. He kind of agreed with that 
view. He'd told Walter before that he thought DIP 1000 had been a 
huge waste of time for minimal gain.

Rikki wanted to point out that without reference counting, there 
was basically no way we could do 100,000 requests per second. 
That was gated by Walter's work on owner escape analysis, and 
that in turn was gated on escape analysis. So he was blocked on 
this, and that was why he wanted to get escape analysis sorted.

Walter said the reason the ROI was so low was that it was 
rather rare for people to have bugs in their programs because 
of errant pointers into the stack. Mathias asked why we were 
spending so much time on it in that case. Walter likened it to 
airplane crashes: they were rare, but they were disastrous when 
they happened. You couldn't be a memory-safe language and have 
that problem.

Mathias said that DIP 1000 made him want to use D less, not more, 
because of the sea of deprecations he got when he enabled it 
with vibe.d. It was just terrible. He was hoping it would never 
be turned on by default.

When it came to the DIP itself, he said that composition just 
didn't work. Any design that required him to annotate his class 
or struct with `scope` in the type definition was dead on 
arrival. He said a lot of people compared it to `const`, which 
was the wrong comparison. `const` was outside in, but `scope` was 
inside out. So if your outer layer was `const` and you composed a 
type with multiple layers, then all your layers were `const`. 
With `scope` it was the other way around. We had no way to 
represent the depth of scopeness in the language. It wasn't 
possible grammatically. It was just unworkable and unusable.

I suggested we put a pin in the discussion here and schedule a 
meeting just to focus on DIP 1000. Everyone agreed.

(__UPDATE__: We had the meeting later and decided we needed to do 
two things to move forward: compile a list of failing DIP 1000 
cases to see if they are resolvable or not; and consider how to 
do inference by default. I have no further updates at this time.)

### Improve error messages as a SAOC project

Razvan said that Max Haughton had proposed improving compilation 
error messages a while back as a potential SAOC project. The goal 
was to implement an error-handling mechanism that was more 
sophisticated than the current approach of just printing errors 
as they happened. The details had yet to be hashed out, but the 
main idea was to implement an error message queue.

One of the problems with the current approach was that errors 
were sometimes gagged during template instantiation. What we 
wanted to do was to save them somewhere so that they could be 
printed when returning to the call site. This would be quite 
useful also for users of DMD-as-a-library.

With SAOC on the horizon, Razvan wanted to avoid the situation 
where the judges accepted an application for this project, and we 
later decided we didn't want to go this route for some reason.

Rikki suggested the queue should be thread-safe, as he needed it 
for Semantic 4. It had been on his TODO list to write exactly 
that, so the project had his support.

Dennis asked what wasn't thread-safe about the current mechanism 
with its global error count. Rikki said he hadn't looked into it, 
but in a multi-threaded scenario, any thread that threw would 
need to write the error out on the main thread. He didn't think 
the functionality was there.

Walter said he'd refactored error handling as an abstract class, 
so it could be overridden to do whatever we wanted. We could make 
it multi-threaded or whatever. The transition to using it was 
incomplete because gags were still in there, but one of the 
reasons he'd done it was to get rid of gags, and that would 
eliminate the global state. He told Razvan that anything like the 
proposed project should be built around instantiations of that 
class.
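A rough sketch of what building on such an abstract class might look like. The names below are illustrative only, not DMD's actual error-sink API:

```d
// Hypothetical stand-ins for the compiler's location and sink types.
struct Loc { string file; int line; }

abstract class ErrorSink
{
    abstract void error(Loc loc, string msg);
}

// An override that queues messages instead of printing them, so they
// can be replayed later, e.g. back at a template's call site:
class QueueingSink : ErrorSink
{
    static struct Entry { Loc loc; string msg; }
    Entry[] queue;

    override void error(Loc loc, string msg)
    {
        queue ~= Entry(loc, msg);
    }
}
```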

Razvan asked if that meant he had Walter's approval for the 
project. Walter said he didn't know what it was trying to 
accomplish so he couldn't say just yet.

Razvan gave the real-world example of calling `opDispatch` on a 
struct. Maybe the body had some errors and failed to instantiate. 
You had no way of knowing that at the call site. It would just 
look like `opDispatch` didn't exist on that struct. Right now, 
without knowing why it failed, there was no way to output a 
decent error message. The error was going to say that there was 
no field or member for that struct.
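Razvan's `opDispatch` scenario can be illustrated with a small, hedged example: the real error inside the template body is gagged, and from the outside the member simply appears not to exist.

```d
struct S
{
    auto opDispatch(string name)()
    {
        undefinedSymbol; // the real error, only seen at instantiation
    }
}

// The instantiation failure is gagged; the caller just sees that
// `foo` doesn't compile, as if S had no such member:
enum hasFoo = __traits(compiles, S.init.foo);
static assert(!hasFoo);
```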

He said there were other examples. The project aimed to save the 
error messages instead of just tossing them in the dumpster so 
that an accurate error message could be output to the user back 
at the call site. When fixing some bugs in the past, he had 
needed to resort to all kinds of hacks to decide why something 
was failing.

Walter said he thought that was worth pursuing. But it would 
involve getting rid of the gagging entirely and replacing it with 
another abstract function or another error handler instantiation. 
Razvan said that wasn't necessarily true. When errors were 
gagged, you could save the state instead of printing them out.

Mathias thought it was a good idea and should go forward. 
Regarding instantiation errors, he said he saw them most often 
when there was an inference issue. For example, he'd do a `map`, 
but somewhere his delegate did an unsafe operation and he ended 
up with an error saying the overload couldn't be found. He 
wondered if there was a way it could print the error about the 
safety problem instead.

Razvan said that this project would save everything that had 
failed so that a decision could be made at the call site by 
searching through the queue. He didn't know if this could be 
solved in other ways.

Walter asked how you would know at the call site which error 
mattered. Razvan said it depended on the use case. Walter said if 
you printed them all out, then you'd end up with the C++ problem 
of hundreds of pages of error messages.

Razvan said the project would give you a tool to put out better 
error messages than we had now. It wasn't intended to just save 
all the error messages and print them all out. That wouldn't make 
sense. Maybe in time--and he suspected Walter wouldn't like 
this--we might have priority error messages.

Walter said no, normally it was the first error that mattered. If 
you just logged the first error message, you'd be most of the way 
to where you were trying to go. Razvan agreed that would be one 
strategy.

Jonathan said that once we had the list, there were different 
things we could do with it. There might be a flag that puts out 
five error messages instead of one, or maybe an algorithm to 
enable it to go more intelligently. If we decided it wasn't doing 
anything for us we could always get rid of it later. But just 
having the list of error messages would enable us to do more than 
we currently could without it, though it might be hard to figure 
out how to use it in some circumstances.

Walter said as an initial implementation, he'd suggest just 
logging the first error message and see how far that got us.

Martin said it was okay as just another straightforward 
implementation of the abstract error sink. What worried him was 
if any extra context was needed, like different error categories 
or warning categories, or instantiation context, that kind of 
stuff. If we needed to extend the interface to accommodate that 
sort of thing, it might get hairy. Interface changes might come 
with a performance cost for compilers that weren't interested in 
the feature. That was something to be wary of.

He said another thing was that we already had a compiler switch 
to show gagged errors.

Third, there were circumstances in which some code only worked 
when a template instantiation was semantically analyzed a second 
time due to forward references or something. If we just went with 
a simple approach, an error on the first analysis of an 
instantiation could be invalidated on the second analysis. But 
even in that case, it might be nice to have the error to let you 
know about the forward reference.

Razvan agreed there could be some problems with this approach, 
but he didn't see any definite blockers. No one objected to 
moving forward with the project.

(__UPDATE__: Royal Simpson Pinto was accepted into SAOC 2024 to 
work on this project.)

### Moving std.math to core.math

Martin said he'd been wanting to move `std.math` to `core.math` 
for years. It had come up in discussions with Walter quite a 
while ago in GitHub PRs, and he recalled Walter had agreed with 
it. It had come up again more recently in attempts to make the 
compiler test suite independent of Phobos. With DMD and the 
runtime in the same repository now, it would be nice for all of 
the make targets to be standalone with no dependency on Phobos 
just to run the compiler tests.

He'd experimented and found that most of the Phobos imports in 
the test cases were `std.math`. One common reason was the 
exponentiation operator, `^^`. There were also some tests that 
tested the math builtins.
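The `^^` dependency shows up with floating-point operands, where the compiler lowers the operator to `std.math.pow` (a small sketch):

```d
import std.math; // required: floating-point `^^` lowers to std.math.pow

double cube(double a)
{
    return a ^^ 3.0; // without the import, this is a special-cased error
}
```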

Calls to the standard math functions were detected by the 
compiler at CTFE using the mangled function names. That was 
already a problem because when we changed an attribute in 
`std.math`, we needed to update the compiler as well due to the 
new mangled name. So we tested that all of that worked and that 
the CTFE math results matched what we expected, which meant 
there was an implicit dependency on Phobos.

Martin said he wanted approval before going ahead because it 
wouldn't be worth it to get going and then be shut down. He 
wanted to make sure everyone was on board with it and that there 
weren't any blockers to be aware of. Phobos would import and 
forward everything to `core.math`, which already existed in the 
runtime. It had something like five functions currently.

LDC already did some forwarding of math functions. `std.math` was 
one of the few Phobos modules in which LDC and GDC had some 
modifications, and that was just to be able to use intrinsics. 
Moving it into the runtime would be nicer as it would minimize or 
eliminate the need for their Phobos forks.

Walter said that `std.math` was kind of a grab bag of a lot of 
things. He suggested just moving things into DRuntime that should 
be `core.math` and forwarding to those, then changing the test 
suite to use `core.math`. He wanted to keep `std.math`. There was 
still a lot of room for math functions that didn't need to be in 
the compiler test suite, and they could remain there.

Jonathan said that in the past when we decided we really wanted 
something in DRuntime that had been in Phobos, but we really 
wanted people importing Phobos, we moved the thing to 
`core.internal`. For example, `std.traits` imported 
`core.internal.traits` to avoid duplicating traits used in 
DRuntime, and users could still get at it through `std.traits`.

In the general case, it was just a question of whether we wanted 
`core.internal` or something more public. He'd prefer going with 
`core.internal` where possible, but either way, he saw no problem 
with the basic idea.

Rikki said if we were talking about primitives that the compilers 
recognized and that were currently living in Phobos, then yeah, 
move them. Full stop, no questions asked. He asked if anyone had 
an objection to that. When no one did, he said that was the 
answer to Martin's question.

Martin said the thing he didn't like about that was that we were 
drawing a line. Where should it be drawn? It wasn't just about 
the builtins. The list of CTFE builtins might not be complete. 
There might be some functions that should be in there but 
weren't. But really, most of the functions in `std.math` were 
detected by the compiler.

As far as he knew, `std.math` was quite nicely isolated and 
didn't depend on anything else in Phobos. He would double-check, 
but he was certain it was good in that respect so that it could 
just be moved over. He really didn't want to split it up. If it 
was in the runtime, it was logical to include it directly from 
there, starting from some specific compiler version and keeping 
it in the Phobos API for a while for backward compatibility. So 
the final location would be in `core.math`.

He said we did the same thing for the lifetime helpers. `move` 
used to be in Phobos. That was a totally bollocks decision. How 
could such a primitive function be in the standard library 
instead of the runtime? But now it was in the runtime, 
unfortunately with slightly different semantics, and he'd been 
using it from there for ages.
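For reference, a small sketch of using `move` from its runtime home (`Resource` is an illustrative type):

```d
import core.lifetime : move;

struct Resource
{
    int handle = -1;
    ~this() {} // with elaborate destruction, `move` resets the source to .init
}

void takeOwnership()
{
    auto a = Resource(5);
    auto b = move(a);       // ownership transferred to b
    assert(b.handle == 5);
    assert(a.handle == -1); // a was reset to its .init state
}
```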

Walter said the dividing line was simple: if you wanted to put it 
in the compiler test suite, it needed to go in the runtime. 
Martin said he would need to check, but he thought it would be 
most of the functions anyway.

Mathias thought we should get rid of the exponentiation operator, 
though that wouldn't solve Martin's problem. Martin said moving 
it to the runtime would get rid of the special case where you got 
the error trying to use it when you didn't import `std.math`. At 
least we'd have that. Walter agreed with Mathias that it should 
go. He thought it was an ugly wart in the language.

Adam said that Phobos 3 was a great opportunity for the change. 
It was a natural dividing line. We could keep Phobos 2 as it was 
and support it for a long time, but Martin could do whatever he 
wanted in Phobos 3. Adam had already been looking at `std.math` 
and thinking how much he dreaded porting it over. So if Martin 
came up with something else and told him how to make it work, 
he'd make it work.

### Primary Type Syntax DIP

Quirin had joined us to discuss [the current draft of his Primary 
Type Syntax 
DIP](https://forum.dlang.org/thread/zymqcnpjcpuphpeulhev@forum.dlang.org) (that was the second draft; his most recent as I write [is the fourth draft](https://forum.dlang.org/thread/cekqyahwnumvesppxsfs@forum.dlang.org)).

He assumed most of us had not read through the entire thing, as 
it was a long text. He thought most DIPs were really, really 
short and missed a lot of detail. He felt that anything that 
touched on it that entered your thoughts should be a part of a 
DIP.

The basic idea of the proposal was that we modify the grammar 
without any change to the semantics or anything like that. It 
aimed to ensure that any type that could be expressed by an error 
message, for example, could be expressed in code as well and you 
wouldn't get parsing errors. You might get a visibility error 
because something was private, but that was a semantic error, not 
a parsing error.

He said the easiest example was a function pointer that returned 
by reference. This could not be expressed in the current state of 
D. The DIP suggested we add a clause to the type grammar allowing 
`ref` in front of some basic types and some type suffixes. What 
had to follow obviously was a function or delegate type suffix, 
and this formed a type but not a basic type. The difference was 
meaningful because, for a declaration, you needed a basic type 
and not a type.

It also suggested that you could form a basic type from a type by 
putting parentheses around it. This was essentially the same as a 
primary expression, where if you had, e.g., an addition 
expression, you could put parentheses around it and then multiply 
it with something else. But you had to put the parentheses around 
it because it would otherwise have a different meaning.

So to declare a variable of a `ref`-returning function pointer 
type, you had to use parentheses:

```
(ref int function() @safe) fp = null;
```

Rikki said that based on his knowledge of parsers, this could be 
difficult to recognize. The best way forward would be to 
implement it and see what happens. If it could be implemented 
without failing the test suite, it shouldn't be an issue and 
could go in.

Quirin said he had started implementing it for that reason. So 
far, it hadn't been a problem. He'd needed to modify something to 
do a further look ahead, but that was a niche case, and he had no 
idea why anyone would write such code. But he hadn't found any 
issues because the language usually tried to parse stuff as 
declarations first. When it didn't work, then it parsed as an 
expression. If it succeeded in parsing as a declaration, it just 
worked.

Walter said that there was a presentation at CppCon in 2017 
titled ['Curiously Recurring C++ 
Bugs'](https://youtu.be/lkgszkPnV8g?si=2dNk7AVdI75_80Eo). One of 
the problems they went into was things like this. Was it a 
function call or a declaration? C++ apparently had all sorts of 
weird errors around things like this. So when you were talking 
about adding more parentheses, there was a large risk of creating 
ambiguities that led to unexpected compiler behavior.

In adding more meaning to parentheses in the type constructor, 
we'd need to be very sure that it didn't lead to ambiguities in 
the grammar, where users could write code that looked like one 
thing, but it was actually another completely unintended thing. 
He didn't know if the proposal suffered from this problem, but he 
suggested caution in adding more grammar productions like this.

Quirin said there were two grammar productions. One was the 
primary type stuff, and the other was just allowing `ref` in 
front of some part so that you could declare a function pointer 
or delegate that returned by reference. He thought the latter one 
should be uncontentious. The only problem was that you could just 
put `ref` in front of something because it was a `ref` variable, 
or a parameter that was passed by reference, and it didn't apply 
to the function pointer type.

Walter said that with the function pointer type, you had two 
possibilities. One was that the function returned by reference, 
and the other was that it was a reference to a function.

Quirin said that was exactly like his second example where you 
had a function that returned a reference to a function pointer 
that returned its result by reference:

```
ref (ref int function() @safe) returnsFP() @safe => fp;
```

You needed the parentheses here to disambiguate.

Walter said D already had a syntax where you could add `ref` on 
the right after the parameter list, and that meant the function 
returned by reference. But D allowed ref in both places to mean 
the same thing, which was an ambiguity in the language.

Quirin said the problem was that each time someone asked about 
this on the forums, the answer was "you can't return a function 
pointer by reference". People complained about putting `ref` 
after the parameter list because it felt unnatural. His DIP was 
trying to make it work with `ref` in front. And if you needed 
parentheses to disambiguate, then you needed parentheses.

Walter wasn't saying Quirin was wrong. He just wanted to put up a 
warning flag that `ref` was currently allowed in both places. 
Changing that could break existing code and result in ambiguity 
errors in the grammar. That was his concern.

Quirin said he had an implementation for the proposal, and the 
implementation for `ref` worked as intended. He'd played around 
with it for quite a while and really tried to push some limits. 
He'd found no issues with it.

He said the same issue applied to linkage. Like a function 
pointer with `extern(C)` linkage. The issue there in his 
implementation was that it didn't apply the linkage to the type. 
He could parse it, but he couldn't apply it, and he didn't know 
why. But everything else worked perfectly. The example 
code he was showing wasn't fantasy code. It was compilable with 
his local compiler.

Walter asked Quirin to watch the video he'd mentioned. He said 
that maybe Quirin had solved the problem, but asked that he 
please review it for grammar and parentheses problems and make 
sure the proposal didn't suffer from them.

There were some questions about the details of the DIP that 
Quirin addressed, and Rikki suggested an alternative to consider 
if it didn't work out. He said it appeared that there was no real 
blocker here.

Walter said it was a laudable goal and he liked it. He just 
wanted to make sure we didn't get into that C++ problem of an 
ambiguous grammar that could be an expression or could be a type, 
then the compiler guessed wrong and caused hidden bugs.

Quirin said he had initially thought this would cause some weird 
niche problem somewhere and that he'd probably find one if he 
implemented it. Miraculously, it just worked. The implementation 
was there and anyone could play around with it. It was so much 
easier than reading a proposal and trying to work it out in your 
head.

Walter said it would be a pretty good thing to try it on the 
compiler test suite. Quirin agreed.

### The 'making printf safe' DIP

Dennis was wondering about [the DIP to make `printf` 
safe](https://forum.dlang.org/post/v7740t$1q51$1@digitalmars.com). It was mostly meant for DMD, which wanted to become safe. But DMD had the bootstrap compiler situation. Was the plan to wait five years until the bootstrap compiler was up to date, or could we have some shorter-term solutions to make DMD's error interface `@safe` compatible?

Walter asked why we needed such an old version for the bootstrap. 
In the old days, his bootstrap compiler was always the previous 
release. Why were we going back so far?

Martin said it was because we had the C++ platforms. If we newly 
conquered a platform using D, the most practical thing to do 
currently was to use GDC. The 2.076 version had the C++ front end 
with those backported patches. That was what he recommended to 
every LDC package maintainer. They were all concerned about the 
bootstrapping process. So he always pointed them to GDC for 
bootstrapping the first version. Then they were free to compile 
more recent versions.

He said the ideal situation was that we could still use that 
specific GDC release to compile the latest version. As far as he 
knew, that was the status quo. So we didn't have to do multiple 
jumps. Just compile that GCC version, which was still completely 
C++, and then you could compile all the existing D compilers 
using that GDC.

So whenever we introduced a requirement for newer language 
features, bootstrapping was going to become a multi-step process. 
That wasn't a problem for us, but it would be for the package 
maintainers. If we did this well, we wouldn't put too much 
pressure on them. Most of them did it in their spare time, 
making sure they had D compilers 
for their platforms. If we made the bootstrapping process more 
complicated for them, they wouldn't appreciate it.

Iain said that it was 2024 and people were still inventing new 
CPUs. He'd seen [Chinese developers creating their own MIPS 
CPU](https://en.wikipedia.org/wiki/Loongson) have to drag out 
the old GDC version and port it to their CPU just to get LDC and 
DMD working on it. That was another modern chip that was up and 
coming. Having a modern version of the D compiler rather than 
just the C++ version kept those developers happy, as they could 
jump to the latest. So that older bootstrap version was 
completely invaluable.

Walter said okay. It wasn't critical that the D compiler source 
code be made safe. It was just something he would like to do. But 
if it was going to cause a lot of downstream problems, then of 
course, what else could we do?

Iain said we'd have to make the documentation very loud and very 
explicit. GDC did pretty well at this, explaining what you had to 
do if you were starting from a given version of the compiler 
because certain versions of GDC were written with a specific C++ 
standard. To get to the latest, you had to go through these 
versions from whatever your starting point was. We should agree 
to do the same for DMD as well.

Rikki noted that Elias had done a new dockerization image of LDC 
which did the bootstrap from the LTS version of it up to the 
latest. He said we should be able to dump the compiler code base 
as C++, and then use that to bootstrap the same compiler version. 
He'd been thinking about that for a long time. It wasn't a 
problem today, but it would become a problem down the road.

Dennis asked if he meant exporting the compiler source as C++, 
and Rikki said yes. Martin said he very much disagreed. It wasn't 
like clang was transformable to C code so it could be 
bootstrapped with a C compiler.

Regarding the LTS version of LDC, he had dropped it because he 
didn't want to backport platform support in the compiler, in the 
runtime, in Phobos, into a very old version with many, many, many 
changes in between, just to get a bootstrap. That was stuff that 
Iain had already taken care of. That was extremely important work.

He said at some point we'd end up in a situation where we 
wouldn't be able to compile the latest with a very old compiler. 
There would be some steps needed in between. But any changes we 
made should be simple stuff. We could add `@safe` here or there, 
or use native bitfields, or whatever. We just had to make a very 
conscious decision to introduce new steps only when we really 
needed to.

Iain added that whenever we introduced a new feature to the 
compiler implementation, it shouldn't be anything fringe. It 
should be a well-established feature that was stable and that we 
knew was working, and happily working for at least five years.

Martin suggested using cross-compilation when experimenting on 
new platforms, and the discussion veered off onto that for a 
while. Then Dennis brought us back to the original point.

He thought we all agreed that the bootstrap situation made it 
kind of complex to add new `printf` features. He wondered if 
there could be an alternative to, e.g., `error("%s", 
expr.toChars())`, where we used the `printf` format that included
the length and had a function that could return a tuple of the 
length and pointer that was compatible with C varargs, e.g., 
`error("%.*s", expr.toPrintfTuple().expand)`. This would be 
compatible with the old compiler. The new compiler could do its 
safety checks, but the old compiler would still work without 
them. This would allow us to make a `printf`-based error 
interface safe with new compilers while not breaking anything. 
We'd just have to ditch the magic format string rewriting in the 
DIP.

Martin said that sounded valuable. All we needed was to make sure 
it compiled with the older compilers. Because our test suite 
was using newer compilers as well, this would ensure we had 
test coverage for the implementation of the new thing.

Walter added that the goal of fixing `printf` here wasn't just to 
fix `printf`, but to get rid of the incentive to use C strings in 
the front end. Right now, half of the data structures used C 
strings and the other half used D strings. Fixing the `printf` 
issue would enable us to tilt the source code toward using D 
strings everywhere.

Dennis noted that `toPrintfTuple` could just convert a D string 
to a `printf` tuple. Walter thought it was a good idea. Mathias 
agreed and asked why we were still using `printf` strings in 
2024. We had type information. Why were we even passing `%s` in 
`std.format`? Tango had a better format for it. C#, Java... they 
had all solved this problem differently. Why were we using it?

Walter said it was because `writeln` sucked. Mathias asked why we 
couldn't fix it. Walter said that right now that was on Adam. 
They had discussed it. The problem was that `writeln` was 
absurdly complicated. If you put it or `writefln` in a piece of 
code, you'd get a blizzard of template instantiations. That made 
it really difficult when you were looking at code dumps to try to 
isolate a problem. With `printf` it was really simple. It was 
just a function call: push a couple of arguments on the stack, 
call a function, done.

Another issue was that `writeln` itself was a bunch of templates. 
The error sink was an abstract interface. He thought that was an 
ideal use case for an abstract interface and it worked great. 
`writeln` was not an abstract interface. It was an overly 
complicated system.

Dennis asked if a viable alternative to `printf`-based errors 
could be that we created a minimal template version of `writeln` 
for DMD, since DMD mostly only concatenated strings and 
occasionally formatted an integer. Walter said we could write our 
own `printf`, but the one in the C standard library was the most 
battle-tested, debugged, and optimized. Dennis emphasized that we 
only needed to concatenate strings. We didn't need things like 
battle-tested float conversion for that.

Jonathan suggested we just wrap it. Dennis said that was also 
okay.

Martin said that DMD at the moment didn't depend on Phobos 
because doing so was a big can of worms. We could write our own 
stripped down version of `writeln` that we needed. But then there 
were similar things in other parts of the code base, like path 
manipulations and stuff. All of that was stuff we already had in 
Phobos, yet had to implement from scratch using some dirty 
`malloc` stuff. That would be one of the first problems.

The second problem was using C varargs for error strings and 
such. This was one of those ABI issues that were hard to get 
right. They were a very platform-specific, special-case, complex 
part of the ABI. This introduced difficulty when conquering a new 
platform in trying to get the compiler to compile itself. If we 
could ditch C varargs and use proper D stuff, that would make it 
all easier.

Adam said that he had talked about simplifying `writeln` and the 
`std.conv` stuff, but he'd found that people protested when 
anyone suggested getting rid of any templates they liked. He was 
on board with what Walter said about `writeln` being problematic 
because it was a blizzard of templates. But he kept hearing from 
people that we shouldn't remove these templates.

Jonathan said that we couldn't be removing the templates for 
range-based stuff. For things like the `write` and `std.conv` 
families, the problem was that they were using templates to take 
your arbitrary type and convert it to a string. The alternative 
was to hand them a string, which meant you had to do the work 
upfront yourself. That might work internally in DMD, but not in 
Phobos.

Regardless, the implementation we had could be improved. It was 
quite slow from what he'd seen. So even if we opted to keep the 
blizzard of templates, we needed to redo it.

Walter reiterated that `printf` was much maligned, but it was the 
most debugged, optimized function in history. Maybe a `writeln` 
could be implemented that just forwarded calls safely to 
`printf`. It had its problems, which was why he had put forward 
the safe `printf` proposal.

He said Jonathan was absolutely correct that templates gave a lot 
of advantages to `writeln`. He wasn't arguing with that. But when 
trying to debug the compiler, dealing with `writeln` was a giant 
pain. That was why he always went back to `printf`. And he didn't 
want the compiler dependent on `writeln`, because then we'd be 
unable to bootstrap the compiler.

Jonathan agreed that we didn't want DMD dependent on Phobos. In 
that case, maybe just wrapping `printf` with something that took 
a `string` and converted it to a C string was the way to go. 
Walter said that was what the safe `printf` proposal did; it 
just had the compiler rewrite the `printf` expression to make it 
memory-safe. Jonathan said we could avoid calling `printf` 
directly with a wrapper function instead. Either way, the 
compiler's situation was different from the general case.

He said we definitely needed to rewrite `writeln` to make it more 
efficient. It wasn't appropriate for the compiler, though, since 
it was doing all kinds of stuff the compiler didn't need.

We left the topic there and moved on to the next one.

### Void-initializing a ref variable

Dennis asked if everyone agreed that `void` initializing a `ref` 
variable should be an error. [The DIP didn't specify 
it](https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1046.md), 
and he didn't think there was any use case for it. Walter said 
that was an error. No one objected.

### Scopes and auto ref

Dennis asked if everyone agreed that the keywords `auto ref` on 
a variable must appear together rather than being applied from 
different scopes, e.g., `auto { ref int x = 3; }`. Walter said 
yes, kill that with fire.

Quirin said he'd noticed that when looking at the grammar, `auto` 
and `ref` didn't always need to be next to each other. It was 
possible, for example, to write `ref const auto foo` in a 
parameter list. He suggested we should ban that. Walter said it 
should be deprecated.

## Conclusion

Given that some of us would be traveling on the second Friday in 
September, just before DConf, we agreed to schedule our next 
monthly meeting on the first Friday, September 6th, at 15:00 UTC.

If you have something you'd like to discuss with us in one of our 
monthly meetings, feel free to contact me and let me know.
