Application Binary Interfaces: a refresher of how D works

Richard Andrew Cattermole (Rikki) richard at cattermole.co.nz
Tue Jan 20 12:11:30 UTC 2026


The N.G. has been a bit light on posts lately, so I thought I'd 
write something up that might drum up some interest.

So let's talk about Application Binary Interfaces (ABIs).

Very generally, an ABI is whatever information is required to 
lower a programming language down to assembly instructions, and 
to determine how memory is laid out.
A simplistic example of this: which registers do function 
arguments go in?

Generally you, as a D programmer, never need to know or care 
about the ABI of something.
There are some exceptions to this, like with bitfields.
You must know that if a bitfield would cross a storage unit 
boundary (the first bitfield determines the size of the unit), 
padding is added so that it fits entirely within one storage unit.

If you have ever written a binding to, say, a C library, you will 
have dealt with ABIs by selecting a calling convention for 
functions, using the syntax ``extern(X) void func();`` or 
``alias Func = extern(X) void function();``, where the calling 
convention selected was either ``C`` or ``Windows``.
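
As a small sketch of both forms, here is a hand-written binding to the C 
standard library's ``strlen``. The alias name ``StrlenFn`` is made up 
for illustration, and in real code you would normally import 
``core.stdc.string`` rather than declare this yourself.

```d
// Function declaration form: uses the C calling convention and C name mangling.
extern (C) size_t strlen(const(char)* s) nothrow @nogc;

// Function pointer form (StrlenFn is a hypothetical name for this example).
alias StrlenFn = extern (C) size_t function(const(char)*) nothrow @nogc;

void main()
{
    StrlenFn fn = &strlen;

    // D string literals are null terminated and convert to const(char)*.
    assert(fn("hello") == 5);
}
```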

The same syntax used for selecting the calling convention usually 
selects the ABI as well.
There is an exception to this, but I'll come back to that later.

### Types and Layout Invariants

Types have an ABI too!
If you specify that a struct has the ABI ``C``, even the size of 
an empty struct can differ:

```d
extern(C) struct C {}
struct D {}
static assert(C.sizeof == 0);
static assert(D.sizeof == 1);
```

The ABI of a type does a lot more than just change the size of a 
struct; it also determines the behavior of a class and its 
layout.
A good example of this is a C++ class.

C++ classes have some limitations compared to D classes: they 
lack a D type info, so down casting cannot be implemented safely.
You can paint a reference to a C++ class as a child, but there 
are no checks to make sure that it really is one.

```d
extern(C++) class Parent {
     void method() {}
}

class Child : Parent {}

void main() {
     Parent p = new Parent;
     Child c = cast(Child)p;
}
```

Codegens out to:

```
_Dmain:
.Lfunc_begin1:
         .loc    1 7 0 is_stmt 1
         .cfi_startproc
         push    rbp
         .cfi_def_cfa_offset 16
         .cfi_offset rbp, -16
         mov     rbp, rsp
         .cfi_def_cfa_register rbp
         sub     rsp, 16
.Ltmp2:
         .loc    1 8 5 prologue_end
         mov     rdi, qword ptr [rip + _D7example6Parent7__ClassZ@GOTPCREL]
         call    _d_allocclass@PLT
         mov     rcx, qword ptr [rip + _D7example6Parent6__vtblZ@GOTPCREL]
         mov     qword ptr [rax], rcx
         mov     qword ptr [rbp - 8], rax
         .loc    1 9 5
         mov     rax, qword ptr [rbp - 8]
         mov     qword ptr [rbp - 16], rax
         .loc    1 10 1
         xor     eax, eax
         .loc    1 10 1 epilogue_begin is_stmt 0
         add     rsp, 16
         pop     rbp
         .cfi_def_cfa rsp, 8
         ret
```

See how there is only one function call, to ``_d_allocclass``?
The cast itself compiles to a plain copy of the reference.
D classes don't do this; they call out to the runtime to down 
cast safely.
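
For contrast, here is a minimal sketch of the same cast with plain D 
classes. The class names mirror the example above; the point is that 
the checked cast yields ``null`` when the object is not actually a 
``Child``.

```d
class Parent
{
    void method() {}
}

class Child : Parent {}

void main()
{
    Parent p = new Parent;

    // Checked at runtime using the type info: p is not a Child, so we get null.
    Child c = cast(Child) p;
    assert(c is null);

    // When the object really is a Child, the down cast succeeds.
    Parent q = new Child;
    assert(cast(Child) q !is null);
}
```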

### Class Layout

A D class is composed of two things in its object layout: a 
header and the user fields.
That header is super important; it gives access to the type info, 
the vtables, and the monitor.
The ABI determines what goes into the header and what layout is 
in active use, as well as how the reference to that class works.
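
A rough way to see the header is to look at instance sizes and field 
offsets. This sketch assumes the usual D class header of a vtable 
pointer followed by a monitor pointer, and an ``extern(C++)`` class 
header of just a vtable pointer; ``Plain`` and ``CppStyle`` are made 
up names.

```d
class Plain
{
    int field; // first user field, expected right after the header
}

extern (C++) class CppStyle
{
    void method() {}
}

// D class header: vtable pointer plus monitor pointer.
static assert(__traits(classInstanceSize, Object) == 2 * size_t.sizeof);

// extern(C++) class header: vtable pointer only, no monitor.
static assert(__traits(classInstanceSize, CppStyle) == size_t.sizeof);

void main()
{
    auto p = new Plain;

    // The user field starts after the two pointer sized header slots.
    assert(p.field.offsetof == 2 * size_t.sizeof);
}
```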

This is especially important to understand when it comes to the 
class hierarchy, and [contravariance and 
covariance](https://en.wikipedia.org/wiki/Type_variance).
For classes this is the basis for how the class hierarchy operates.
If the [top](https://en.wikipedia.org/wiki/Any_type) class does 
X, so do the children.
Doing something different in a child breaks the class hierarchy.

This is important to understand for editions: fundamentally the 
top class will determine the edition, not the child class.
So if the top class is D2 legacy code, then the child class will 
have a monitor and be able to synchronize on it.
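
To illustrate, any class rooted at today's ``Object`` can be used as 
the argument of a ``synchronized`` statement, because the monitor 
slot comes from the top class. ``Account`` is a made up class for 
this sketch.

```d
class Account
{
    int balance;
}

void main()
{
    auto a = new Account;

    // Locks the monitor stored in the object header inherited from Object.
    synchronized (a)
    {
        a.balance += 1;
    }

    assert(a.balance == 1);
}
```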

The layout of a class cannot be changed by edition alone. Doing 
so would put fields at different offsets and cause program 
corruption in the best case scenario; in the worst case scenario 
it can have unknown side effects during runtime.

In practical terms we are not that limited when it comes to 
editions.
We can swap out the top class, or change behavior, like how a 
synchronized statement with no argument could be made to lock on 
the this pointer, instead of generating a new monitor for that 
function in a class.
Or we could remove synchronized as a storage class on a class 
declaration.

### COM classes, the exception

Remember how earlier I said that there was an exception to the 
rule of calling conventions?
Well, it's for classes, specifically COM classes.

Somebody at some point in time decided that if an interface is 
called ``IUnknown``, then any class that implements this 
interface must be a COM class, hooray!
Wouldn't it have been better to give COM classes their own ABI 
and mark it with ``extern(COM)``? Yeah, absolutely.
Much more consistent.
But that isn't the way of Microsoft.

This has had an interesting implication. When the hook to 
``new`` a class was converted to a template, the ``TypeInfo`` 
check for whether it was a COM class was replaced with a check 
for whether the calling convention was ``Windows``.
Except that isn't how the compiler understands it.
The compiler has a field called ``com`` and a method called 
``isCOMclass`` that gets it.
So for a while there the hook was broken, at least until I added 
a new trait called ``isCOMClass`` and updated the hook to use it.
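
A minimal sketch of the name based rule, assuming a compiler recent 
enough to have the trait and that it is spelled 
``__traits(isCOMClass, T)``. The local ``IUnknown`` here is a 
hypothetical stand-in for the real one in the Windows bindings, 
declared only to show that the name is what matters.

```d
// Hypothetical stand-in: the rule described above keys off the interface name.
interface IUnknown {}

class ComLike : IUnknown {}

class Plain {}

// Assumption: the trait takes a type and yields a bool.
static assert(__traits(isCOMClass, ComLike));
static assert(!__traits(isCOMClass, Plain));

void main() {}
```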

### Down casting

Finally I want to cap this off with a detail about how D classes 
work when it comes to down casting.
To [down cast 
interfaces](https://github.com/dlang/dmd/blob/0847658a9aed377d3cd149243c1170dad42295cb/druntime/src/core/internal/cast_.d#L115), the runtime first casts the interface to the top type, which is ``Object``: 
a D2 legacy edition class.
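
From the user's side this looks like the following sketch. The 
``Shape``, ``Square`` and ``Triangle`` names are made up; the casts 
themselves are ordinary D.

```d
interface Shape
{
    int sides();
}

class Square : Shape
{
    int sides() { return 4; }
}

class Triangle : Shape
{
    int sides() { return 3; }
}

void main()
{
    Shape s = new Square;

    // An interface reference to a D class can always be taken back to Object.
    Object o = cast(Object) s;
    assert(o !is null);

    // Down casts from the interface are checked at runtime.
    assert(cast(Square) s !is null);
    assert(cast(Triangle) s is null);
}
```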

The top type ``Object`` is critical to the way D classes work. I 
am convinced that without a tremendous amount of work, the 
``ProtoObject`` class could never have worked.
Too much depends on ``Object`` being the top type for D classes.
We are not completely screwed here: we can swap the top type, but 
it's going to have to be a 1:1 swap with another class, without 
messing around with ``Object``.
It must always be called ``object.Object`` as well, so that's 
going to need to be hard coded if it's ever moved.


