D Language Foundation July 2023 Quarterly Meeting Summary

Mike Parker aldacron at gmail.com
Sat Jul 29 14:15:12 UTC 2023


The quarterly meeting for July 2023 took place on the 7th at 
15:00 UTC and lasted just over 70 minutes.

At the quarterlies, representatives from companies using D in 
production join the core D Language Foundation team to tell us of 
issues that are affecting them, problems they've solved 
themselves, interesting details about their projects, and really 
anything related to their use of D. At the July meeting, we were 
joined for the first time by representatives from Ahrefs, Decard, 
and Auburn Sounds.

Now that we've split the DLF portion of the quarterlies into 
separate monthly meetings, I no longer give each DLF rep a turn 
in the quarterlies. Once the industry reps have all taken their 
turns, the DLF reps speak only if they have something job-related 
or something the industry reps might be interested in. In this 
meeting, only Ali and Walter had something to say.

## The attendees

* Andrei Alexandrescu (DLF)
* Mathis Beer (Funkwerk)
* Walter Bright (DLF)
* Ali Çehreli (DLF/Mercedes Benz R & D North America)
* John Colvin (Symmetry)
* Mathias Lang (DLF/Symmetry)
* Dennis Korpel (DLF)
* Mario Kröplin (Funkwerk)
* Átila Neves (DLF/Symmetry)
* Mike Parker (DLF)
* Igor Pikovets (Ahrefs)
* Guillaume Piolat (Auburn Sounds)
* Carsten Rasmussen (Decard)
* Robert Schadek (DLF/Symmetry)
* Robert Toth (Ucora)
* Bastiaan Veelo (SARC)

## The summary

### Bastiaan
Bastiaan had nothing for us this time.

### Robert T.
Robert said Ucora had been preparing for DConf but had nothing 
for us beyond that.

### Mathis B. and Mario
Mario said Funkwerk had been having some memory problems and 
handed off to Mathis Beer for the details.

Mathis said that since somewhere around D 2.096 or 2.100, 
Funkwerk had been using more memory. Unfortunately, it's hard to 
know if it's a real problem. He said when you're looking at a 
process that's 10 GB in production, it's hard to tell if it's 
because you've turned up the amount of data going into it, if 
it's a timing issue, or if there's been a degradation. He 
mentioned a test case [he'd posted to the 
forums](https://forum.dlang.org/post/u7sn1m$1smk$1@digitalmars.com) that runs a hundred threads creating hashmaps, then either deleting them explicitly or setting them to `null`. It ends up with 200 KB of GC memory and a few hundred megabytes of resident memory, which isn't satisfying. What they're seeing is processes that admittedly allocate in bursts, but whose memory usage just keeps growing.

He said that at some point it stabilizes, but it might be because 
it just GCs so much that the CPU can't process enough data to 
grow further. They have processes in production that they have to 
regularly restart to get memory usage under control. He'd added 
[a warning to the `std.json` 
documentation](https://github.com/dlang/phobos/blob/master/std/json.d#L9) that it's probably not the best framework when you have huge amounts of data. If you have a hashmap for every JSON node, that's going to hit you. Taking in hundreds of megabytes of data on the network and grinding through it is the sort of situation in a threaded system that the GC isn't happy with. With a moving collector, you could compensate in retrospect for a lot of problems and grow your memory, then go back down again. But that's fundamentally not viable in D as it is. Other than the warning, he's not sure what can be done.

Walter noted that the D GC doesn't return memory to the operating 
system when it collects. It just ratchets up. Mathis said they'd 
been manually calling `minimize`, but if a pool has even one or 
two false references into it, that pool can't be released. 
Walter agreed that a moving collector would solve the issue in 
general, but we don't have one because we can't reliably find all 
the references or prove they're all valid references. So we're 
kind of stuck with that.
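
As a rough illustration of the calls discussed here, the following is a minimal sketch, not Funkwerk's test case. The function name `churnAndMinimize` and the workload size are hypothetical. It fills an associative array, drops the reference, then asks the GC to collect and to release unused pools via `GC.minimize`. Note that `GC.stats` reports the GC heap only, not the resident memory of the process, so a gap like the one Mathis described would have to be observed with OS tools.

```d
import core.memory : GC;
import std.conv : to;
import std.stdio : writefln;

/// Hypothetical workload: fill an associative array, drop it, then ask the
/// GC to collect and to hand unused pools back to the operating system.
auto churnAndMinimize(size_t entries)
{
    int[string] table;
    foreach (i; 0 .. entries)
        table[i.to!string] = cast(int) i;

    table = null;   // drop the only reference this function holds
    GC.collect();   // run a full collection
    GC.minimize();  // release free pools where possible
    return GC.stats();
}

void main()
{
    auto before = GC.stats();
    auto after = churnAndMinimize(200_000);
    writefln("before: used=%s free=%s", before.usedSize, before.freeSize);
    writefln("after:  used=%s free=%s", after.usedSize, after.freeSize);
}
```

Because the collector is conservative, a stale value on the stack or in a register can still pin a pool, so the numbers printed by a sketch like this may vary between runs and compilers.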

Mathis said he'd been speculating that they're getting references 
into large pools that are caused by references sort of hanging 
around on the stack doing nothing. They've never been able to 
prove that. He thinks they're persisting because they happen to 
be in some 8-byte region where they're just not getting 
overwritten by anything as functions get called. He's wondering 
if it would be possible to have pointer data for the stack at a 
given point where the GC interrupts it. Like you have detailed 
information about what every register is doing, what every byte 
is doing, at a given point.

Walter said one thing they could do is, for functions near the 
root of the call stack, once they're done with the memory they 
can just set the reference to `null`. That can help with this and 
that's the only thing he can think of; when you're done with 
something, just set the reference to `null` so that it doesn't 
hang around on the stack. Mathis said the problem is that the 
associative array doesn't do that. They can set their references 
to null, but the runtime functions aren't cleaning up after 
themselves and are keeping references around. Mathis said they 
have no live variables. Their theory is that they have stuff 
above the stack pointer that gets skipped as they go into the GC. 
They can pretty much get rid of it by calling `alloca` with 500 
bytes or 2 KB before going into those functions.

Walter suggested using `assert(0)` to get a stack trace so they 
can see which functions are still live on the stack, then go to 
those functions and look for any dangling references. Mathis said 
the forum post he mentioned earlier is an example of having live 
references to data that are not in any functions on the stack. 
Walter asked him to post it in the chat. He did (it's the same 
post I linked above, but [here it is 
again](https://forum.dlang.org/post/u7sn1m$1smk$1@digitalmars.com)).

### Igor
Igor started with an introduction to what Ahrefs are doing. They 
have a crawler that crawls the web 24/7, downloads the pages, 
stores them in their database, then serves users some analytical 
information about the pages. So their main product is an index of 
links. Currently, they're working on doing a full-text index of 
the web. Their main language is OCaml. They use C++ and now D for 
more performance-sensitive parts. They started using D seriously 
about a year ago and have two people writing D full-time.

__GC documentation__

The first issue he brought up was related to Funkwerk's issue. 
They're using `std.json` to parse hundreds of megabytes of JSON. 
It's super slow in a multi-threaded app. Upon investigating, they 
found the slowness was caused by a global lock in the GC. This is 
not mentioned in the documentation. It's an important performance 
property that should be explicitly documented. Their solution was 
to drop `std.json` for another library and everything's working 
alright.

(I haven't gotten around to it yet, but updating the GC 
documentation is on my task list. Also, we have since discussed 
std.json in a planning session. I'll have more on that in a 
future update.)

__C++ strings__

Igor said they're interfacing with a lot of C++ libraries for 
AI/machine learning stuff. Unfortunately, D's binding to the C++ 
`std::string` is limited to the old C++ ABI. In the new ABI, 
`std::string`'s memory layout has an internal pointer. That's not 
supported in D's runtime model, so D can't bind to the new 
strings. The workaround is to compile the C++ code with the old 
C++ ABI. That works, but it's annoying because you have to 
compile everything from source. So if someone provides a 
pre-compiled library that was built against the newer C++ ABI, 
you can't use it directly. It would be nice to get that fixed 
somehow if possible.

Mathias Lang said the C++ string was the whole reason we 
introduced copy constructors and deprecated postblits. He doesn't 
remember the details of what needs to be done and will have to 
look at it.

__C++ interop documentation__

Next, Igor said the documentation for D's C++ interface is 
incomplete. For example, they didn't know that it was possible to 
catch C++ exceptions in D. They were doing a workaround until 
they realized that it was possible.

Walter said it wasn't documented because it doesn't work with 
some C++ compilers, such as Microsoft's: we can't catch 
their exceptions. Igor said that's okay because they're on Linux. 
Walter said he understood. He then explained that it doesn't work 
with the MS exception-handling mechanism because he couldn't make 
heads or tails of how it works. He said the documentation for 
32-bit exceptions was clear as mud, but he was able to get it 
working. The documentation for the 64-bit MS implementation is so 
baffling he gave up on it. On POSIX systems it's pretty 
standardized, so it was easy to figure out. So as long as you're 
not using MS C++, you're okay. Still, we should update the 
documentation to note that it works on Linux and other POSIX systems.

Mathias Lang asked which compiler they were using on Linux. Igor 
said LDC. Mathias said that GDC and LDC will pass quite a few 
objects by hidden pointers, so that will prevent a lot of bugs 
you might otherwise encounter with C++ interop. He suggested they 
keep using LDC.

Igor agreed. He said we really should update the docs about this 
even if it isn't working on MS. It's a big point that you can use 
C++ interop on Linux quite easily and a very nice property. And 
if it's possible to somehow get C++ exception handling fixed on 
Windows, maybe by paying someone or through a GSoC or SAoC 
project, that would be nice.

Ali asked if Walter knew anyone at MS who could explain the 
implementation to him. Walter said he didn't know anyone who 
knows how MS exceptions work. And based on the state of the 
documentation, he suspects they don't want anyone to figure it 
out. However, if someone can get it working, he'd welcome the 
contribution. He just doesn't have time for it.

(The documentation on C++ interop has been outdated for a while. 
We're going to add the note about exception handling, but Mathias 
Lang has also agreed to bring the docs up to date. As seen in 
[his DConf presentation last year](https://youtu.be/mI6-PmZy-u0), 
he probably knows more about the state of D's C++ interop than 
anyone else.)

__vibe.d__

Igor said when they use vibe.d for anything, they get two screens 
worth of deprecation warnings from inside vibe.d when they 
compile. He's seen a Bugzilla issue and a forum thread about it, 
and that's nice, but the only solution is to disable deprecation 
warnings globally. So that's what they're doing. That means they 
won't be able to see deprecation warnings about their own code 
that they can fix. This is moderately annoying.

Walter said he would be talking about deprecations later in the 
meeting. Mathias Lang noted that the specific vibe.d deprecations 
Igor mentioned had been fixed in master, there just hadn't yet 
been a new release, and Igor said that was beautiful.

Mathis Beer said Funkwerk does something that's cheating a bit 
and has the potential to break: they split libraries with 
deprecations off and just link against them with their own code 
with deprecations on. That can break when you have templates.

(See Walter's remarks near the end of this summary for more on 
deprecations.)

### Carsten

Carsten said he'd been programming in D since 2008 and had worked 
sort of silently. Initially, he used D for an internal tool at 
the company he had before, but now he's working at Decard and 
they're working on a distributed database for consensus. He said 
it's called a blockchain, but it's not really the same. It's a 
distributed system that reaches a consensus with different nodes 
and then communicates between nodes.

Most of their code is in D so everything is much easier. They use 
some C and Go for some things, but only for some very specific 
stuff. Everything else is in D. The only problems they've had were 
some challenges porting their cryptographic library to Android 
and iOS, but he thought they'd solved that now. He said it was 
mostly because they didn't understand how Android worked.

They've built their own system where they use BDD 
(Behavior-Driven Development) that automatically generates code 
in D, then they write the unit tests there. The workflow of BDD 
with unit tests works. Instead of JSON, they use their own data 
format called HIBON (Hash Invariant Binary Object Notation), which 
was inspired by BSON. With HIBON, when you generate a data 
structure you can guarantee that the cryptographic hash is the 
same. So the bytes are organized the same independent of where 
they are working.

They would like to publish HIBON because, in principle, it can be 
used for serialization. They've built it in and it's easy to use. 
Thanks to introspection, you can just mix in a HIBON record and 
serialize or construct a class or struct. It makes their flow 
much easier to work with. Everything in their database is stored 
in HIBON. They even have a signed Remote Procedure Call that they 
call HIRPC. That is, instead of asking for permission, you sign 
your RPC so that when you receive it you know if you have 
permission to do something.

They've had problems sometimes with deprecations, but they just 
solve them and move on. Other than that, they did have problems 
before when running multithreading in unit tests, which might not 
be a good idea. Sometimes a test would fail when multithreading 
and it would leak to all the other ones. But since they moved to 
using BDD, they don't have the problem anymore.

Átila said that running unit tests in parallel is probably a good 
idea because you'll find out if you're doing something wrong. He 
asked how they were running them in parallel. Carsten said they 
weren't running tests in parallel. But, for instance, if you have 
a thread test, like a concurrency test or something, and that 
fails, then it seems like it's unpredictable what actually 
happens. Átila said it's because the exception gets gobbled up. 
Carsten said they had also looked at [Átila's unit-threaded 
library](https://code.dlang.org/packages/unit-threaded) and would 
like to incorporate that at some point.
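
One plausible reading of the "gobbled up" remark can be sketched with `core.thread`. This is an assumption about the mechanism, not a reconstruction of Decard's tests, and the names `runAndCollect` and `failingWorker` are hypothetical. An uncaught exception in a worker thread is stored on the `Thread` object and only surfaces when the thread is joined. If nothing joins the thread, the failure never reaches the test that started it.

```d
import core.thread : Thread;

/// Runs `work` on a new thread and returns whatever it threw, or null.
Throwable runAndCollect(void function() work)
{
    auto worker = new Thread(work);
    worker.start();
    // join(false) returns the uncaught Throwable instead of rethrowing it.
    return worker.join(false);
}

/// Hypothetical worker that fails.
void failingWorker()
{
    throw new Exception("worker failed");
}

void main()
{
    auto err = runAndCollect(&failingWorker);
    assert(err !is null);
    assert(err.msg == "worker failed");
}
```

Calling `join()` with its default argument rethrows the stored exception in the joining thread, which is usually what a test wants.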

### Guillaume
Guillaume brought up an LDC issue he [had reported a while 
back](https://github.com/ldc-developers/ldc/issues/4388). It was 
taking half of his build time and was a minor annoyance in terms 
of iteration time. (Martin was not at the meeting. I was supposed 
to ping him about Guillaume's issue after the meeting, but forgot 
about it. I emailed him about it as I was writing this summary.)

Next, he said that he distributes shared libraries to a lot of 
customers. He wanted to highlight the fact that a self-contained 
shared library is a valid usage of D. Because he doesn't control 
the host machine, shared Phobos and shared DRuntime can be a 
problem in those cases.

Finally, he said a couple of other people had asked him to 
mention that they would like to see Objective-C support in all D 
compilers. Since neither Iain nor Martin were at the meeting, I 
told him I'd put Objective-C support on the agenda for a future 
monthly meeting or planning session.

### John
John said he was unaware of any significant problems Symmetry had 
that were worth bringing up. He asked Mathias Lang and Robert if 
they had anything. Mathias had nothing, and Robert said he was 
happy.

### Ali
Ali said he was happy. He was using an old version of DMD at work 
to write script-like programs which use large files and was doing 
very good stuff with it. The company were preparing to post a D 
intern position, so their D team would soon increase beyond one 
person.

Walter asked why they were sticking with an old version of DMD. 
The answer: because that's what's installed on their standard 
workstations. And with the nature of the programs Ali is writing, 
it doesn't matter. There's no pressing need to upgrade.

### Walter
Walter took some time to talk about deprecations. He said that 
the night before the meeting, he'd made [a post in the announce 
forum](https://forum.dlang.org/thread/u87vfb$1b9h$1@digitalmars.com) about evolving the language (related: [the planning update](https://forum.dlang.org/thread/jmtcppvsweimsojladlj@forum.dlang.org) in which I talked about our new deprecation policy).

Walter had gotten a lot of feedback that our deprecation scheme 
was a failure. He used to think that deprecations were a good way 
to evolve the language: have the compiler announce a deprecation 
with a warning, then turn the warning into an error a few years 
later. It had become clear to him that this was an unacceptable 
approach. So in his forum post, he described a new way of doing 
things.

We're not going to deprecate things that don't hurt the language. 
We won't remove things just because they're old fashioned, or 
because there's a better way or something like that. The idea is 
to be backward compatible because people hate getting an older D 
package that they try to use and their screen is filled with 
deprecation messages. So we're just going to have to stop doing 
that.

What he proposed, and had partially implemented, as a replacement 
is the `-wo` switch that enables printing warnings for things 
that are obsolete. The idea is that this is not the default 
behavior of the compiler like deprecation warnings are; you have 
to go and ask to get the messages. Then, ideally, people won't be 
bothered by improvements to the language: obsolete features still 
compile, and you only get warnings if you ask for them. We only 
deprecate and remove things if they're really destructive and 
problematic.

There's also the problem that some of the deprecations are for 
features that cause code to be unsafe. The issue here is that the 
compiler is unable to prove that the code is safe. It doesn't 
mean that the code is unsafe or broken. So he planned to modify 
that such that you only get compiler guarantees in `@safe` code 
when you have fixed all the warnings about legacy behavior. That 
will not break existing, working, debugged code, but if you want 
the compiler guarantees about safety, you're going to have to 
turn on `-wo` and fix the warnings it gives you.

Again, the idea is that people can use older, debugged, 
working libraries without getting a blizzard of complaints from 
the compiler. Projects like vibe.d will not just automatically 
break every time there's a new version of the compiler, or at 
least not intentionally break with screens full of deprecation 
messages. It's perfectly reasonable that people using vibe.d 
don't want to be bothered by deprecation messages that come from 
compiling vibe.d. It leaves a really bad impression and anger at 
us for breaking existing projects. So he posted in the forums for 
a discussion to see how it goes.

He had already sent in some PRs to revert some deprecations that 
broke people's code and didn't actually need to be deprecated. 
They're not hurting anything. He had also spoken with Dennis 
asking him to prioritize reverting the deprecation of `alias 
this` in classes. We don't need to deprecate it. It's annoying 
having it in the language, but there's no easy way for people to 
replace it. So our path forward with it is not to improve it or 
fix it---which we've been unable to do and was why it was 
deprecated in the first place---but to encourage people not to 
use it. We leave it in so it will work, code will compile, but we 
can make the `-wo` switch warn about it.
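
For readers unfamiliar with the feature under discussion, here is a minimal sketch of `alias this` in a class. The `Meters` class and the `twice` function are hypothetical names chosen for illustration. Depending on the compiler version, this may or may not produce a deprecation message.

```d
/// Hypothetical wrapper class that forwards to its payload via `alias this`.
class Meters
{
    double value;

    this(double value) { this.value = value; }

    alias value this; // lets a Meters be used where a double is expected
}

double twice(double x) { return 2 * x; }

void main()
{
    auto m = new Meters(1.5);
    double raw = m;          // implicit conversion through alias this
    assert(raw == 1.5);
    assert(twice(m) == 3.0); // the conversion also applies at call sites
}
```

Replacing this pattern means touching every site that relies on the implicit conversion, which is consistent with Walter's point that there is no easy migration path.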

Carsten asked if it would be possible to set a flag module-wise 
or package-wise to enable or disable warnings. This started a 
rather long discussion about allowing `-wo=foo` and/or `-d=foo`, 
and whether `foo` should be a module or package, or a directory 
on the file system. Walter's intent with `-wo` was that it allow 
old code to compile with no warnings *without requiring the user 
to make any changes*. If you require the user to change the build 
system to silence warnings, then you're requiring them to make 
changes.

The conversation stayed largely on topic, with people making 
points about deprecations (Carsten said he likes them, and his 
company update their code whenever they encounter them; Walter 
talked about how MS went to great lengths in developing Windows 
95 to avoid breaking existing Windows applications that were 
using the system APIs incorrectly but still managed to work; 
etc.), compiling code with different flags, how the issue might 
not be deprecations themselves but the fact that people were 
getting multiple lines of the same message, and so on.

Andrei used a couple of analogies to make the argument for more 
fine-grained control ("In C++ this `#include` file that I'm 
including in a library from the 1960s. And what's with that 
unqualified whatever, whatever. I don't care for that stuff. 
There should be a better response for that than just `-wo`. It's 
got to be `-wo=` then a list of package names or something like 
that.", "The refinement of a bad idea can be an excellent idea. 
Just look at the Porsche 911. The early prototype is a horrible 
car. It's just ugly. And then they just changed the pit and it's 
a great car.").

In the end, Walter said we can enhance `-wo` to give it a list of 
modules that it applies to. That's a very reasonable thing.

(I'll have more information about `-wo`, deprecations, and plans 
for evolving the language in future summaries and updates.)

## Conclusion
Our next quarterly meeting should take place on October 6. We 
held our July monthly meeting on the 14th. I'll have the summary 
of that one published in the next few days. I'll also soon post 
an update from the two planning sessions we held on July 21 and 
28.





