D Language Foundation July 2024 Monthly Meeting Summary
Mike Parker
aldacron at gmail.com
Thu Dec 12 08:31:00 UTC 2024
The D Language Foundation's monthly meeting for July 2024 took
place on Friday the 12th. It lasted a little over an hour.
## The Attendees
The following people attended:
* Aya
* Walter Bright
* Rikki Cattermole
* Timon Gehr
* Dennis Korpel
* Mathias Lang
* Mike Parker
* Steven Schveighoffer
* Adam Wilson
## The Summary
### Bitfields
The meeting got off to an impromptu start. Walter, Rikki, and I
arrived several minutes early and started chatting. At some
point, the conversation turned to Walter's bitfields proposal. It
continued as I started recording just before the scheduled start
time and the others started coming in. When Timon raised his hand
to comment, I said that this may as well be the first agenda item.
The recording begins with Walter and Rikki debating something
about crossing storage unit boundaries, with Rikki saying it was
needed and Walter saying it wasn't. I don't have the full context
on that, though. Just after that, I said that I was curious about
how often we'd really need to interface with C bitfields from D.
In the popular C libraries I'd maintained bindings for over the
years, there was only one that used bitfields. I just ignored
them, but they were using `uint` so, according to Walter, that
meant they should work out of the box. Even so, how often would
we need to worry about it?
Walter said they were needed in the D compiler. GDC and LDC were
hybrid C++ and D programs that couldn't use bitfields now. I said
that was fine. We could do whatever needed to be done in the
compiler. What I was wondering was how often we would need to use
bitfields in the wild. If it wasn't very often, then requiring
`extern(C)` to use the C compiler layout for mixed language code
and defining our own layout otherwise might be a worthwhile
compromise.
Walter said a more interesting question would be how often
bitfields must conform to an externally imposed layout. He said
that was almost never. The classic case was a hardware register.
But then the hardware register was on your computer and the C
compiler was on your computer, so it would do what the C compiler
did.
Rikki said he'd dealt with it far more often with file formats
and networking than when interfacing with C. It was just normal
in those contexts. Walter asked if Rikki used file formats with
bitfields in them. Rikki said yes, they were pretty common.
Walter asked how C interfaced with that. Rikki said by doing it
manually. Walter said, "Exactly!"
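For readers unfamiliar with what "doing it manually" looks like in C, it usually means shifts and masks over a fixed-width integer, which pins every bit position regardless of the compiler. The following is a minimal sketch, not something shown in the meeting; the one-byte header format and all function names are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical on-disk header byte (invented for illustration):
 *   bits 7..4  version
 *   bits 3..1  type
 *   bit  0     flag
 */
static uint8_t pack_header(unsigned version, unsigned type, unsigned flag)
{
    /* Mask each field to its width, then shift it into position. */
    return (uint8_t)(((version & 0xFu) << 4) | ((type & 0x7u) << 1) | (flag & 0x1u));
}

/* Accessors reverse the process: shift down, then mask. */
static unsigned header_version(uint8_t h) { return (h >> 4) & 0xFu; }
static unsigned header_type(uint8_t h)    { return (h >> 1) & 0x7u; }
static unsigned header_flag(uint8_t h)    { return h & 0x1u; }
```

Because the positions are spelled out in the code, the result does not depend on how any particular compiler lays out bitfields.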
Timon said Walter was basically advocating for not using
bitfields. If he didn't want people to use the feature, he
shouldn't give it a nice syntax. That was just a bad design.
Walter said they would have to disagree on that. Timon asked if
it was a good design to create something that shouldn't be used
but had a better syntax than the thing that should be used. In
that case, yes, he would agree to disagree.
Walter said he wanted to be able to use bitfields in the D
frontend. Timon said he should be allowed to and it should have
nice syntax. He agreed with those constraints. He didn't agree
that we should just copy the C thing into D. He also thought that
if you defined it on the D side, and you defined the C side and
it looked the same, then they should match. He wasn't against
this.
Walter said they agreed on that point at least. If it looked the
same in D and C but did something different, that was going to be
completely unexpected and would potentially be a memory
corruption error. However, it was documented that D bitfields
would match the layout of the associated C compiler.
If you were using another C compiler on the same platform that
didn't match the associated C compiler, then there could be no
guarantees on the C side that the layouts would match. On POSIX
and Mac, he'd never seen a C compiler that didn't match what all
the other compilers did on those platforms, because that would be
madness.
The only place he'd seen differences was on Windows. The
Microsoft compiler did it one way and the GCC ports did it the
GCC way. He thought that was an unfortunate decision, but there
was nothing he could do about it. The Digital Mars C compiler
layout was designed to match Microsoft's. And Microsoft's layout
was designed to match what the other C compilers did in the olden
days on x86.
He said the people writing C compilers weren't crazy, and they
didn't do perverse things just because the standard let them.
They followed the herd because they wanted people to use their
compilers and didn't want to upset their users. So the only place
this had been a problem was between Microsoft and GCC on Windows,
where he'd seen people complain about bitfields and other subtle
differences between them.
Rikki said he had proposed another solution: an attribute. The
default would still match the C compiler, and that was fine, but
then people who wanted an out would have one with a UDA.
Walter said the UDA was a good idea, but he thought doing manual
alignment as he had proposed in the forums was a better one. It
was simple and instantly understandable when you looked at it.
There was no confusion about what was happening and it worked. He
didn't know of a case where he couldn't make it work. You just
lined things up before the bitfield. You could match any layout
you wanted by changing how you laid things out. He did that to
avoid normal member alignment problems. He'd either add or remove
padding fields to make it work so he could avoid having to fiddle
with alignment attributes.
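The padding-field approach can be illustrated in C. This is a sketch of the general technique only, not the forum proposal itself; the `record` struct and its required offsets are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record that must match an external layout:
 * a 1-byte tag at offset 0 and a 32-bit value at offset 4. */
struct record {
    uint8_t  tag;
    uint8_t  pad[3];   /* explicit padding instead of an alignment attribute */
    uint32_t value;
};

/* Fail the build, rather than corrupt data, if the layout ever drifts. */
_Static_assert(offsetof(struct record, value) == 4, "value must sit at offset 4");
_Static_assert(sizeof(struct record) == 8, "record must be 8 bytes");
```

The compile-time checks are an addition of this sketch; they turn a silent layout mismatch into a build error on any platform where the assumption does not hold.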
I said that my one argument against this was that a lot of our
user base, younger people coming into D these days, just
generally weren't aware of this kind of thing. They weren't
coming from a background where they needed to know these things.
They were coming from dynamic languages, or Java and C#, or
something else that wasn't C. Experience with that kind of
knowledge about layouts just wasn't there. That would be a big
hurdle for them, and I thought we should keep that in mind.
Walter said that was a great point. But as he'd mentioned on the
forums, that could happen now even without bitfields. Structs had
alignment differences between C compilers, and you had to deal
with that at some point. I said that very rarely came up in my
experience. Walter said that came up for him all the time.
He said there was also byte ordering. Was it Big Endian or Little
Endian? He didn't think anyone was making a Big Endian machine
anymore, but that was a big deal. That was what the `bswap`
hardware instruction was for. Rikki said Big Endian was still
used in networking and wasn't going away anytime soon. Walter
agreed.
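As a concrete illustration of the byte-ordering issue, here is a minimal C sketch with hypothetical helper names. The first function reads a big-endian (network order) value one byte at a time, so it behaves the same on any host. The second is a portable byte swap of the kind that optimizing compilers often reduce to a single instruction such as `bswap` on x86.

```c
#include <stdint.h>

/* Read a 32-bit big-endian (network order) value from a byte buffer.
 * Assembling it byte by byte works on little- and big-endian hosts alike. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Portable byte swap: reverses the order of the four bytes in v. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}
```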
Steve said Walter was kind of correct in that if you had to deal
with it, then you would just have to learn how to deal with it.
If the layout were important to you, then you'd learn how it
worked. And if you didn't care, you'd be able to use bitfields
just fine. If you weren't trying to match a specific layout, then
it didn't matter.
His problem was that it was just confusing. If you did want to
know how to do the layout, then you needed to understand all
these arcane rules and which C compiler was being used. It wasn't
obvious why something was laid out in a specific way.
Walter asked what kind of naive users would add attributes to
their bitfields to match a certain layout. It wouldn't occur to
them that they needed to do it. Steve said the naive user wasn't
going to care. They weren't going to be doing layout things. If
they did want to match what a C header did, then they'd type in
whatever was there and it should just work. Walter said it
*would* work.
Steve said yes, that was the whole point. D was matching the C
layout, so it would work. But if you were trying to match a
layout that was in a protocol spec or something, the spec
wouldn't say "use an int here" or "use a long there". It would
just say "use this many bits". So now you had to figure out how
to make the bits line up.
Walter said the beauty of that was that if it ended up being
misaligned, you'd find out immediately. It wouldn't work. Steve
said then you wouldn't know why. Walter said you would. He'd
conformed to external layouts before, and the first thing he did
was test to see that it worked.
Steve said you wouldn't know how to fix it. Say you needed to
align something to 64 bits, so you used a `long` and it didn't
work. What then?
Walter said he'd be happy to add a description of how to do
manual alignment to the bitfields documentation. Then anyone
reading would see a section that said, "If you need to conform to
an external layout, here's how".
Timon said he shouldn't need to figure out how to do the layouts
on the platforms he cared about. It was unnecessary work. He
worried that people who saw the shiny bitfields would assume they
behaved the same way everywhere, would do something that wasn't
sane, and then a problem would crop up in one of his
dependencies. He would have to fork the dependency to make it
work on the platforms he cared about because they differed from
the platforms the other person cared about.
Walter said if you were using code from somebody else who made it
work by using alignments or manually adding padding, then it
would work on your system, too. Timon said he understood that. He
was talking about two separate people working on two separate
dependencies, and one of them screwed up on Windows because they
only cared about Linux. He had Windows users, and now it was his
problem.
Walter said that would be true even with a struct layout that
wasn't anticipated for the system you were building the same code
on. Timon agreed but said that was much less common. Walter said
if you switched between `-m32` and `-m64`, your struct layouts
would differ. That was true on every compiler he'd ever tried it
on.
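A small C sketch shows the effect Walter described. The `node` struct and helper names are hypothetical, and the sizes in the comments assume typical x86 targets where a pointer is aligned to its own size.

```c
#include <stddef.h>

/* A struct whose layout depends on the target's pointer size. */
struct node {
    char         tag;
    struct node *next;   /* typically 4 bytes with -m32, 8 bytes with -m64 */
};

/* Layout as seen by this particular build. */
static size_t next_offset(void) { return offsetof(struct node, next); }
static size_t node_size(void)   { return sizeof(struct node); }
```

On such targets, a 32-bit build would typically place `next` at offset 4 for a total size of 8, and a 64-bit build at offset 8 for a total size of 16, from the same source.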
Timon agreed, but by doing things in a sane way, you'd never run
into this. We should just enforce sanity. He didn't see why this
was so controversial.
Walter said it was controversial because if you wanted the
default in D to do something different than what the C compiler
did, it would be surprising to the user. A few people shook their
heads, including Timon and Steve. Walter said he saw they
disagreed with him, but he would be surprised if it happened to
him.
Steve said he disagreed that this was their position. Walter said
he didn't see how it wasn't their position. The default behavior
should match the associated C compiler and do the same thing. It
was the same with structs now. If you had a struct in C and
wanted to use it in D, you could do that because it matched the
associated C compiler. Steve said that was his position as well.
Adam said he wanted to back up the point I'd made about new users
based on his experience with C#. They had bitfields there that
they called "flags". It was just an enum that you'd stick what
we'd call a UDA onto, and then you could do bitfield operations
on it. People generally didn't have a problem with it. There was
a little bit of documentation on it, but requiring an attribute
to change the layout of an enum wasn't asking a whole lot. Even
people coming to D would understand that a bitfield was something
special to which they'd have to attach an attribute.
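The flags idea Adam described amounts to an enum whose members each occupy one bit. Below is a rough C equivalent without the attribute; the `perm` enum and helper names are hypothetical and are not taken from C# or from any D proposal.

```c
/* Hypothetical permission flags; each enumerator occupies one bit. */
enum perm {
    PERM_NONE  = 0,
    PERM_READ  = 1 << 0,
    PERM_WRITE = 1 << 1,
    PERM_EXEC  = 1 << 2
};

/* Set, clear, and test individual flags in a combined value. */
static unsigned perm_set(unsigned p, enum perm f)   { return p | (unsigned)f; }
static unsigned perm_clear(unsigned p, enum perm f) { return p & ~(unsigned)f; }
static int      perm_has(unsigned p, enum perm f)   { return (p & (unsigned)f) != 0; }
```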
Rikki said Nim had three modes for bitfields. One matched the C
compiler, one was the packed thing that we wanted to do, and in
the third the compiler could do whatever it wanted. That was working for
them. He said Rust was getting bitfields as well. He thought they
weren't always going to use the C layout for them.
Walter said the original design for D was that the compiler could
reorder struct members any way it wanted. He'd been specifically
thinking about reordering to eliminate packing holes and things
like that. It soon became obvious that it was just a bad idea. So
he'd implemented it the way C did it and he was done. He'd never
had a single complaint about field alignment in the D compiler.
Not one. And the alignment did change things.
Another reason he'd abandoned reordering was that he'd realized
that people sometimes ordered fields in certain ways for caching.
They wanted their most used field in the hottest part of the
cache. You wanted to group together the stuff you used regularly
and put the rest somewhere else. In that case, you didn't want
the compiler reordering it. If somebody carefully laid out a
struct in C, and then the D compiler came along and reordered it,
that wouldn't go over very well.
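The "packing holes" that reordering would have eliminated are easy to see in C. In this sketch the struct names are hypothetical, and the sizes assume a target where a 32-bit integer is 4-byte aligned, which is the common case.

```c
#include <stdint.h>

/* Declaration order leaves 3 bytes of padding after `a` and 3 after `c`. */
struct loose {
    uint8_t  a;
    uint32_t b;
    uint8_t  c;
};

/* Same members, reordered by hand so the small fields share one slot. */
struct tight {
    uint32_t b;
    uint8_t  a;
    uint8_t  c;
};
```

On such targets `struct loose` is typically 12 bytes and `struct tight` 8. The order in `struct loose` may still be the one the author wants for cache locality, which is the trade-off Walter pointed to.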
At this point, I said we should table this discussion. We had two
scheduled agenda items to get to, and this one had taken up a
significant chunk of time.
(__NOTE__: In a planning session in May, we had agreed that
Walter should make some changes to the bitfields DIP, after which
it would be okay to move forward for a verdict from Átila. Walter
did make some changes to it, but has put it on the agenda for the
December meeting for further discussion.)
### Sumtypes
Rikki said he had a DIP in development which covered the basics
of matching a type. It was three weeks in and was ready for a
pull request. He said he'd give it another week if anyone wanted
to have a look and comment. He said there was something iffy
about how he was supporting `static if`, but it was ready to go.
Second, he talked about [his proposal for a "memberof"
operator](https://github.com/dlang/dmd/pull/16161). It had a
problem regarding argument-to-parameter matching, something that
Walter had voiced strong opinions about in the past. That had
come up again [in the form of type inference in the DIP Ideas
forum](https://forum.dlang.org/thread/zbugncpaooowjsxldzue@forum.dlang.org).
Rikki said he was blocked on that and wanted Walter to make a
decision about a way forward if there was one.
Walter asked Rikki to write it up so he could fully understand
it. Rikki said he'd simplify it: it was a simple rewrite to
`context.identifier`. Walter said he would have to look into it
and study it. He wasn't that good at thinking on his feet about
something he wasn't using. Rikki said it was already implemented.
Walter asked him to share the PR link and asked if there was any
documentation for it. Rikki said there were comments. Walter
asked if he could also do a changelog entry. Rikki said he would.
Rikki said the main problem with parameter/argument matching was
that it had to call into semantic for verification because he
didn't see another way to do it. He needed to know if it would
have to be rewritten to avoid calling into semantic. Walter said
he understood the question but he didn't understand the problem.
Rikki said that as a feature, it could be extended out to
identity types and a bunch of cool stuff. It would be nice to
have. Over the years, a frequent request was to be able to say,
"Here's my identifier, here's an enum, get the member for me so
that I don't have to write out the enum".
Walter said he would need an explanation of the feature and what
it was for because he didn't know. Since it was something he
wasn't familiar with, he couldn't have an opinion on it right now.
Rikki said that was good. He added that it wasn't just blocking
sumtypes, it was also blocking [value type
exceptions](https://github.com/rikkimax/DIPs/blob/value_type_exceptions/DIPs/DIP1xxx-RC.md)
because they needed a zero-size value type to be returned. It was
in a register. That was how you got zero-size exception support.
Walter said he didn't know what a "zero-size exception" was.
Rikki said it was from one of the C++ proposals.
Walter said that C++ had gone so far off the deep end that he was
very wary of doing anything just because C++ was doing it. Rikki
said it was their family of sumtype-based exception handling. It
was all kind of the same thing, just different names. He was
pretty sure that for D we would only need to do what he had
designed.
Steve asked if there was an article or video that explained
sumtype exceptions so that we could understand them. Rikki said
he'd written a DIP. Steve said he didn't need a DIP. He was
asking about something for the layman user. How would they use it
and how would it work? Rikki started to explain what it was
about, but Steve said he was asking for something to look at
outside of the meeting, not to talk about right now.
Walter asked if Rikki was talking about throwing a sumtype value.
Rikki said it was a struct, not a sumtype. It was a sumtype under
the hood. Walter said D didn't throw structs. Rikki said under a
new mechanism it could. It wouldn't be calling out to the runtime
exception mechanism. It would all be through the return of the
function. It would just do it very nicely and hook into try/catch
and throw.
Walter said he wanted to reduce the amount of complexity in
try/catch and throw, not add value types to it. Rikki said it was
a completely different mechanism, something we'd been talking
about for ages. Walter said he was going to have to see a
document on this. C++'s ability to throw values had been a
gigantic design mistake.
Rikki said that was completely different. Mathias suggested that
the value type was just an implementation detail. Rikki said it
sort of was. It would throw your struct. The implementation
detail was that sometimes it would actually be returned from the
function in addition to the return type that you would specify.
Walter said he didn't understand how this had anything to do with
exceptions. Rikki said it used the same syntax and was called an
exception, but was a completely different mechanism.
Walter said this was obviously a complicated thing and he could
not say it was a great idea or a bad idea or anything in between
without taking the time to study its design. He said we could
argue about bitfields all day because he knew it well. He didn't
know anything about this.
Timon said he was pretty sure that Walter had suggested something
like this before in the newsgroups. He expected Walter just
didn't understand what Rikki was referring to. Walter said that
might be entirely true.
Dennis asked if the idea was that you'd be returning a
traditional error code, but in this case, the compiler would
rewrite throw and catch to check the error code. Mathias said
that was his understanding, too. Rikki said you could pass
whatever user data you wanted.
Walter asked why you couldn't just use return, then. Dennis said
you could forget to check it. It was also error-prone. Walter
said okay, he just needed more information. He couldn't figure it
out and think about it with just a conversation. He asked Rikki
to please write a document, then he would study it and get back
to him.
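To make Dennis's reading concrete, here is a C sketch of the pattern: the error travels in the return value, and the check a compiler could insert at each call site is written out by hand. This is an illustration of that general idea only, not the design in Rikki's DIP; every name in it is hypothetical.

```c
#include <stddef.h>

/* Hypothetical result type: the error is carried in the return value
 * instead of going through the runtime's unwinding machinery. */
struct parse_result {
    int err;     /* 0 on success, nonzero error code otherwise */
    int value;   /* meaningful only when err == 0 */
};

enum { ERR_NONE = 0, ERR_EMPTY = 1, ERR_NOT_DIGIT = 2 };

/* "Throwing" is just returning with err set. */
static struct parse_result parse_digit(const char *s)
{
    struct parse_result r = { ERR_NONE, 0 };
    if (s == NULL || s[0] == '\0') { r.err = ERR_EMPTY; return r; }
    if (s[0] < '0' || s[0] > '9')  { r.err = ERR_NOT_DIGIT; return r; }
    r.value = s[0] - '0';
    return r;
}

/* "Catching" is the check at the call site. Written by hand it can be
 * forgotten; generated by a compiler it could not be. */
static int digit_or_default(const char *s, int fallback)
{
    struct parse_result r = parse_digit(s);
    if (r.err != ERR_NONE)
        return fallback;   /* plays the role of the catch block */
    return r.value;
}
```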
Mathias asked if we did that without using the exception ABI,
wouldn't it break the ability to use exceptions between D and
C++? Rikki said it wouldn't. One was a class which used the
existing mechanism, one was a struct which didn't.
Walter said he'd given up on catching C++ exceptions in D on
Windows. It was undocumented by Microsoft and too complicated, so
he'd just abandoned it. It worked on 32-bit Windows because it
was documented. Microsoft claimed they had documented it for
64-bit, too, but they hadn't. It was a giant mess. He would have
to spend a great deal of time reverse engineering it, and he
didn't want to spend the time on it. So D had its own exception
handling mechanism on 64-bit Windows.
### Returning variable-sized stack allocations from a function
Aya said she'd been wondering if it were possible for the
language to return a variable-sized stack allocation from a
function right now. Walter said it was not. Aya thought we should
have some syntactic sugar for it. There was a way to do it, but
it was clunky.
Walter said that normally when you returned variable-sized
things, you did it by reference. Aya said you would have to
allocate at the call site and pass the function a pointer to the
memory for it to populate. That would effectively be a stack
return.
Walter said that would work, but another thing was that we didn't
want to step away from the C ABI. He'd tried that before with D
and it had been a disaster. We had to conform to the C ABI
because not only did everybody follow it, the debugger wouldn't
work with anything but the C standard ABI. Even though the
documentation said it would, it would not. Apparently, no one
ever tested the debugger with a non-C calling convention.
Inventing our own would not work.
However, the C ABI did say that if you were passing a large
struct, a pointer to it would be passed instead and then the code
generator would fill it in. That was just a variation on the idea
of return by reference.
Aya said she was thinking about smaller allocations in general.
It was possible to call a function that returned the size of the
allocation. Then maybe you'd need to store what you were going to
allocate in intermediate storage, which then would need to be
allocated. So you would either need to allocate twice anyway or
call twice all of the logic that determined the size of the
return value. Her main thought was that she didn't know if there
was a nice way to simplify it.
Walter said it would have to be done by reference one way or
another. Aya agreed.
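The clunky approach Aya described, where one call asks for the size and a second fills caller-owned memory, looks roughly like this in C. It is a sketch under assumed names (`repeated_size` and `repeated_fill` are hypothetical), and it is not a proposal for how D should do it.

```c
#include <stddef.h>
#include <string.h>

/* First call: report how many bytes the result needs. */
static size_t repeated_size(const char *word, size_t times)
{
    return strlen(word) * times + 1;   /* +1 for the terminating NUL */
}

/* Second call: fill memory the caller owns, for example on its own stack.
 * The sizing logic (strlen here) is effectively run a second time. */
static void repeated_fill(char *out, const char *word, size_t times)
{
    size_t len = strlen(word);
    for (size_t i = 0; i < times; ++i)
        memcpy(out + i * len, word, len);
    out[len * times] = '\0';
}
```

A caller would reserve `repeated_size(word, times)` bytes in its own frame and pass that buffer to `repeated_fill`. The duplicated sizing work is the cost Aya was hoping syntactic sugar could hide.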
Timon said this was just the named return value optimization for
a type with `sizeof` that was a dynamic value. He thought it
would be a nice thing to have, but a bit tricky to build.
Aya said she'd also been thinking about struct interfaces. I
mentioned Átila's library. Mathias asked why Aya couldn't use
classes. Walter connected that back to the variable-sized
allocations, saying that classes were variable-sized objects
passed by reference. Maybe Aya could think about using classes to
implement the idea.
Aya said the problem arose when you had a lot of very small
structs that you wanted to share an interface. You'd just have a
lot of small classes implementing the same interface, with lots
of heap allocations and fragmentation for no reason.
Walter suggested using COM classes. They were simpler and
smaller. Aya asked if they were portable. Walter said they were.
They were meant to match the COM interface in Windows. The
compiler support was there. The smallest COM object only
consisted of a pointer. Googling would turn up an explanation of
them. They were pretty clever and a nice feature. Aya said she
would look into it.
While the rest of us discussed the next agenda item, Aya did some
searching about COM, so we came back to her.
She asked if `IUnknown` was required to work with COM interfaces.
Walter said it was. Aya said that was behind a
`version(Windows)`. Walter said we could remove it. Rikki said it
could be used without `IUnknown`. It could work with anything.
Adam said several things in D were improperly versioned to
Windows, so it shouldn't be taken as gospel. He'd encountered it
with ODBC support, something he'd been using on Linux. In those
cases, it should be okay to just PR it away. He could guarantee
that COM worked on Linux and on Mac. It was just a C ABI thing.
Walter noted that Microsoft's COM interface relied on the
existence of the functions `QueryInterface`, `AddRef`, and
`Release`. If you didn't inherit from `IUnknown`, you wouldn't
get those. Adam said that was correct. You could do it without
those, but you'd have to recreate the functionality.
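The layout underlying COM-style objects can be sketched in plain C: an object is, at minimum, a pointer to a table of function pointers. The names below are hypothetical and are not the Windows SDK declarations. Only the two reference-counting functions are recreated; interface querying is left out for brevity.

```c
#include <stddef.h>

struct counter;

/* Table of function pointers shared by every object of this kind. */
struct counter_vtbl {
    unsigned long (*add_ref)(struct counter *self);
    unsigned long (*release)(struct counter *self);
};

struct counter {
    const struct counter_vtbl *vtbl;   /* must be the first member */
    unsigned long refs;
};

static unsigned long counter_add_ref(struct counter *self) { return ++self->refs; }

/* A real implementation would free the object when the count reaches zero. */
static unsigned long counter_release(struct counter *self) { return --self->refs; }

static const struct counter_vtbl counter_vtable = { counter_add_ref, counter_release };

static void counter_init(struct counter *c)
{
    c->vtbl = &counter_vtable;
    c->refs = 1;
}
```

Callers go through the table, as in `c.vtbl->add_ref(&c)`, so any language that can follow the C ABI can use the object.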
Dennis said that Vladimir had wanted to expose some Windows
bitmap-related structs, and as he recalled Walter and Jonathan
had been opposed to it because it would be a bug to import
Windows stuff on non-Windows. Adam said in this case COM itself
was not Windows-specific. Walter brought up the GUID thing in COM
and thought that was Windows-specific.
Adam said that was true. The struct itself could probably be
exposed because there was nothing specific to Windows in that.
Walter said you would need a way to get a GUID. Adam asked if we
could use UUID and use Microsoft's methodology. Walter said yes,
but something would have to be done to make that work.
Adam said his general point was that COM was just a C ABI trick.
Windows had a bunch of support for it that probably wasn't
available elsewhere. If we wanted to make the effort to recreate
it in our own environment, that might work. He thought we didn't
need the interface query part of it.
After this, there was a bit of discussion to clarify some
confusion Aya had about the spec. In the end, she said she was
good.
### An update from Walter
Walter said he'd been working on two things.
First, we'd had two major requests to get move constructors
working, so it was a priority. He would divide his time between
move constructors and ARM code generation. Weka had wanted the
feature for a long time. They were a major user, so we needed to
work on it.
Second, the PR for the ARM backend was a diff of over 5000 LOC
and was practically unreviewable. He'd talked with Dennis about
merging it even though it wasn't yet a functioning backend. He
continually rebased it, but it was hard for anyone to follow his
progress because he kept adding to this gigantic PR. He thought
it would be better if it were mainlined.
Walter said that Dennis had brought up testing it. Running it
through the test suite would not work, because the test suite
required a fully functioning compiler. Adding an ARM target to
the test suite would probably be necessary in the future. He
didn't know how that might work. He didn't know much about how
the test suite was set up.
Rikki said as long as the ARM backend wasn't exposed and none of
the pathways were called into, then merge it. Walter said he'd
put it behind a switch and he didn't think it would affect
anyone's code. Rikki said he should document the switch, but
otherwise it sounded fine to merge.
## Conclusion
Before we left, I gave an update on the planning for DConf '24
and some potential hiccups (which thankfully didn't come to
pass), and Rikki said that he and Steve had been looking into how
other languages handled coroutines and had concluded that
stackless was the way to go. Then we called it.
Our next monthly meeting took place on August 9th at 15:00 UTC.
If you have something you'd like to discuss with us in one of our
monthly meetings, feel free to reach out to me and let me know.