third draft: add bitfields to D

Quirin Schroll qs.il.paperinik at gmail.com
Fri Jul 26 00:08:12 UTC 2024


On Wednesday, 24 July 2024 at 08:57:40 UTC, IchorDev wrote:
> On Sunday, 5 May 2024 at 23:08:29 UTC, Walter Bright wrote:
>> https://github.com/WalterBright/documents/blob/2ec9c5966dccc213a2c4c736a6783d77c255403a/bitfields.md
>
> I’ve spent a long time observing the debate about bitfields in 
> D, so now it’s time for me to give my feedback.
>
> ## Criticism of Bitfields
> Bitfields are an incredibly bare-bones feature that addresses 
> only a small subset of the difficulties of managing bit-packed 
> data, are difficult to expand upon, and are arbitrarily 
> relegated to a field of an aggregate for maximum inconvenience. 
> The DIP itself even points out that ‘many alternatives are 
> available’ for situations where bitfields aren’t an appropriate 
> solution. This serves as an admission that bitfields are not a 
> very useful feature outside of C interoperability, because 
> programmers expect and want structs to be laid out how they 
> choose, not in some arbitrary way that’s compatible with the 
> conventions of C compilers. You cannot choose how 
> under/overflow are handled, have different types for different 
> collections of bits in the field safely, or even construct a 
> bitfield on its own unless it is wrapped in a dummy struct. I 
> think if we add this version of bitfields to mainline D, then 
> it should only be as a C interoperability feature.
>
> ## How to Improve
> If we want to add a bitfield equivalent to D, let’s make it 
> better than a bitfield in every possible way: let’s make it an 
> aggregate type. I’ll call it ‘bitwise’ as an example:
> ```
> bitwise Flavour: 4{
>   bool: 1 sweet, savoury, bitter;
> }
>
> bitwise Col16: ushort{
>   uint: 5 red;
>   uint: 6 green;
>   uint: 5 blue;
> }
> ```
> Here it is slightly modified with some comments so you can 
> understand what’s going on:
> ```d
> //size is 4 bytes. Without specifying this, the type would be
> //1 byte because its contents only take 1 byte
> bitwise Flavour: 4{
>   bool: 1 sweet; //1 bit that is interpreted as bool
>   //default values for fields, and listing multiple
>   //comma-separated fields:
>   bool: 1 savoury = true, bitter;
> }
>
> //You can implicitly cast this type to a ushort, so it should
> //be 2 bytes at most
> bitwise Col16: ushort{
>   uint: 5 red; //5 bits that are interpreted as uint
>   uint: 6 green;
>   uint: 5 blue;
> }
> ```
> ### How is this better than a bitfield?
> Because it’s a type it can be easily referenced, passed to 
> functions without a dummy struct, we can have template pattern 
> matching for them, they can be re-used across structs, can be 
> given constructors & operator overloads (e.g. for custom 
> floats), and can have different ways of handling overflow:
> ```d
> bitwise Example{
>   ubyte: 1 a;
> //assigning 10 to a: 10 & a.max
> //(where a.max is 1 in this case)
>   @clamped byte: 2 b;
> //assigning 10 to b:
> //clamp(10, signExtend!(b.bitwidth)(b.min), b.max);
> //(where b.min/max would be -2/1 in this case)
> }
> ```
> But what about C interoperability? Okay, add `extern(C)` to an 
> anonymous bitwise and it’ll become a C-interoperable bitfield:
> ```d
> struct S{
>   extern(C) bitwise: uint{
>     bool: 1 isnothrow, isnogc, isproperty, isref, isreturn,
>       isscope, isreturninferred, isscopeinferred, inference,
>       islive, incomplete, inoutParam, inoutQual;
>     uint: 5 a;
>     uint: 3 flags;
>   }
> }
> ```
> I think this approach gives us much more room to make this a 
> useful & versatile feature that can be expanded to meet various 
> needs and fit various use-cases.

I like it. The only thing that’s odd to me is `int: 21 foo, bar`. 
Visually, `21` groups with `foo` and `bar`, but it’s meant to be 
read as `int: 21` followed by `foo, bar`. We could agree to use 
no space, i.e. `int:21`, or use something other than `:`, e.g. 
`int{21}`. That looks much more like a type, and `int{21} foo, 
bar` looks much more like a list of variables declared to be of 
some type. Essentially, `int[n]` is a static array of `n` values 
of type `int`, whereas `int{n}` is a single `n`-bit signed value. 
Since `int` is 32 bits wide, `n` can be at most 32. For `bool`, 
only `bool{1}` would be possible (maybe also `bool{0}`, if we 
allow 0-length bitfields).
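
To make the "single `n`-bit signed value" reading concrete, here is a 
minimal sketch in today's D. Nothing in it is proposed syntax: the 
template `SBits` and its methods `setWrapped` and `setClamped` are 
hypothetical names chosen for illustration. It only shows the value 
range such a type would have and the two overflow policies from the 
quoted post (masking versus `@clamped`).

```d
import std.algorithm.comparison : clamp;

/// Hypothetical stand-in for `int{n}`: an n-bit signed value held in an int.
struct SBits(uint n)
if (n >= 1 && n <= 32)
{
    enum int min = cast(int)(-(1L << (n - 1)));     // -2 for n == 2
    enum int max = cast(int)((1L << (n - 1)) - 1);  //  1 for n == 2

    private int value;

    /// Wrapping store: keep the low n bits, then sign-extend them.
    void setWrapped(long v)
    {
        enum shift = 32 - n;
        // Shift the low n bits to the top, then shift back arithmetically.
        value = (cast(int)(cast(uint) v << shift)) >> shift;
    }

    /// Saturating store, in the spirit of the proposed `@clamped`.
    void setClamped(long v)
    {
        value = cast(int) clamp(v, cast(long) min, cast(long) max);
    }

    int get() const { return value; }
}

void main()
{
    SBits!2 b;

    b.setWrapped(10);      // 10 is 0b1010; the low two bits are 0b10
    assert(b.get == -2);

    b.setClamped(10);      // saturates at the largest 2-bit signed value
    assert(b.get == 1);
}
```

With a dedicated `int{n}` type, the policy would presumably be part of 
the declaration rather than the call site, which is the point of the 
attribute in the quoted proposal.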

In general, the compiler should be free to optimize the layout of 
`extern(D)` bitfields. An attribute like `@packed` could require 
that the layout be exactly as specified, and `extern(C)` would 
give C compatibility.
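
For comparison, "layout exactly as specified" is what one gets today by 
packing by hand. The sketch below is only an illustration: the struct 
reuses the `Col16` name from the quoted post, and the bit positions 
(red in the lowest bits, blue in the highest) are an assumption, not 
something either post specifies. A `@packed` attribute would 
presumably let the compiler generate accessors like these while 
guaranteeing the same positions.

```d
/// Hypothetical hand-packed 16-bit colour; the shifts fix the layout.
struct Col16
{
    ushort raw; // assumed layout: bits 0-4 red, 5-10 green, 11-15 blue

    uint red() const   { return raw & 0x1F; }
    uint green() const { return (raw >> 5) & 0x3F; }
    uint blue() const  { return (raw >> 11) & 0x1F; }

    // Each setter clears its own bits, then stores the masked new value.
    void red(uint v)   { raw = cast(ushort)((raw & ~0x001F) | (v & 0x1F)); }
    void green(uint v) { raw = cast(ushort)((raw & ~0x07E0) | ((v & 0x3F) << 5)); }
    void blue(uint v)  { raw = cast(ushort)((raw & ~0xF800) | ((v & 0x1F) << 11)); }
}

// The whole value fits in the two bytes of a ushort.
static assert(Col16.sizeof == 2);

void main()
{
    Col16 c;
    c.red = 31;
    c.green = 0;
    c.blue = 1;

    assert(c.raw == 0x081F);  // blue in bit 11, red in bits 0-4
    assert(c.red == 31 && c.green == 0 && c.blue == 1);
}
```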

