second draft: add Bitfields to D

Mon Apr 29 06:44:08 UTC 2024

On 4/28/2024 3:30 PM, Timon Gehr wrote:
> However, I would still much prefer a solution that explicitly introduces the 
> underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones 
> visible to introspection in a portable way, so that introspection code does not 
> really need to concern itself with bitfields at all if it is not important and 
> we do not break existing introspection libraries, such as all serialization 
> libraries.

I doubt introspection libraries would break. If they are not checking for 
bitfields, but are just looking at .offsetof and the type, they'll interpret the 
bitfields as a union (which, in a way, is accurate).

>> Symbolic Debug Info
> 
> This does not seem like a strong argument. I am pretty confident debug info can 
> work pretty well regardless of how D lays out the bits.

I'm not. I followed the DWARF spec and it didn't work, because apparently the 
only thing that was ever tested was what the C compiler actually did. In order 
to get gdb to work, I wound up ignoring the spec and doing what gcc did. It's 
the same with object file formats. The spec is somewhat of a fairy tale; what 
matters is what the associated C compiler actually does.

> I like that the members are not as cluttered. I guess maybe some people still 
> would like to access the underlying data (e.g., to implement a pointer to 
> bitfield as a struct with a pointer plus bit offset and bit length, or 
> something), so perhaps you could add a note that explains how to do that.

Pointers to bitfields will work just the same as they do in C. I don't understand 
what you're asking for.

> You forgot to say what `.tupleof` will do for a struct with bitfields in it.

It does exactly what you'd expect it to do:

```
import std.stdio;
struct S { int a:4, b:5; }
void main()
{
     S s;
     s.a = 7;
     s.b = 9;
     writeln(s.tupleof);
}
```
prints:
```
79
```
It's not necessary to specify this, because this behavior does not diverge from 
field access semantics. Only things that differ need to be specified. Specifying 
"it works like X except for A,B,C" is a lot more reliable and compact than 
reiterating everything X does.

> I think it would be better to have such a `__traits` even just for 
> discoverability when people look at the `__traits` page to implement some 
> introspection code.

There isn't one for other kinds of members; it's just `allMembers`.

>> testing to see if the address of a field can be taken, enables discovery of a 
>> bitfield.
> 
> Not really, a field could be an `enum` field, and you cannot take the address of 
> that either. And if we ever add another feature that has fields whose address 
> can be taken, existing introspection code may break. It is better to be explicit.

An enum is distinguished by it not being possible to use .offsetof with it.
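
A minimal sketch of that distinction, using only an ordinary field and a manifest constant so that it compiles without any bitfield support. The struct `S` and its member names are made up for illustration. Under the proposal, a bitfield would sit between the two cases: `.offsetof` works, but `&` does not.

```d
struct S
{
    int x;       // ordinary field: has storage, an offset and an address
    enum e = 3;  // manifest constant: occupies no storage in S
}

// An ordinary field supports .offsetof ...
static assert(__traits(compiles, S.x.offsetof));

// ... while an enum member does not, and its address cannot be taken either.
static assert(!__traits(compiles, S.e.offsetof));
static assert(!__traits(compiles, &S.e));

void main() {}
```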

>> The values of .max or .min enable determining the number of bits in a bitfield.
> I do not like this a lot, it does not seem like the canonical way to determine 
> it. `.bitlength`?

I agree it's a bit(!) jarring at first blush, but it's easy and perfectly 
reliable. A .max of 7 (signed) or 15 (unsigned) is always going to be a 4 bit 
field. We do a lot of introspection via indirect things like this.
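
As a rough sketch of how that might look in introspection code: `bitWidth` below is a hypothetical helper, not part of the proposal. It takes the value that `.max` would report and uses the signedness of its type to recover the width. It is plain D and does not need bitfield support to compile.

```d
import std.traits : isSigned;

/// Hypothetical helper (not part of the proposal): recover a field's
/// width in bits from the value its .max property would report.
uint bitWidth(T)(T maxValue)
{
    uint bits = 0;
    // Count the value bits needed to represent maxValue.
    for (ulong v = maxValue; v != 0; v >>= 1)
        ++bits;
    // A signed field spends one more bit on the sign.
    static if (isSigned!T)
        ++bits;
    return bits;
}

void main()
{
    assert(bitWidth(7) == 4);    // signed, .max == 7
    assert(bitWidth(15u) == 4);  // unsigned, .max == 15
    assert(bitWidth(1u) == 1);   // unsigned single-bit flag
}
```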

>> The bit offset can be introspected by summing the number of bits in each 
>> preceding bitfield that has the same value of .offsetof.
> 
> I think it would be much better to just add a `__trait` for this or add 
> something like `.bitoffsetof`. This is a) much more user friendly and b) is a 
> bit more likely to work reliably in practice. D currently does not give any 
> guarantees on the order you will see members when using `__traits(allMembers, 
> ...)`.

I overlooked that bitfields can have holes in them, so something like 
.bitoffsetof is probably necessary.
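
To illustrate one way a hole can defeat the summing approach, here is a small model in plain D. It does not use real bitfields. The `Field` type and both functions are hypothetical, and the "packed" layout assumes fields are allocated in declaration order from the low bit, which is only one possible C layout.

```d
/// Hypothetical description of one bitfield; an empty name models an
/// anonymous padding field such as `int :3;`.
struct Field { string name; uint width; }

/// Offset obtained by summing the widths of the preceding *named* fields,
/// which are the only ones introspection by member name can see.
uint summedOffset(const Field[] fields, string target)
{
    uint offset = 0;
    foreach (f; fields)
    {
        if (f.name == target) return offset;
        if (f.name.length) offset += f.width;
    }
    assert(0, "no such field");
}

/// Offset under the assumed layout, where padding also takes up bits.
uint packedOffset(const Field[] fields, string target)
{
    uint offset = 0;
    foreach (f; fields)
    {
        if (f.name == target) return offset;
        offset += f.width;
    }
    assert(0, "no such field");
}

void main()
{
    // Models: struct S { int a:4; int :3; int b:5; }
    immutable Field[] s = [Field("a", 4), Field("", 3), Field("b", 5)];
    assert(summedOffset(s, "b") == 4);  // what summing would conclude
    assert(packedOffset(s, "b") == 7);  // where b sits in the modeled layout
}
```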