Extending D to support per-class or per-instance user-defined metadata

Jean-Louis Leroy jl at leroy.nyc
Mon Dec 11 20:08:02 UTC 2017


I just had a discussion with Walter, Andrei and Ali about open 
methods. While Andrei is not a great fan of open methods, he 
likes the idea of improving D to better support libraries that 
extend the language - of which my openmethods library is just an 
example. Andrei, correct me if I misrepresented your opinion in 
this paragraph.

Part of the discussion was about a mechanism to add user-defined 
per-object or per-class metadata (there's another part that I 
will discuss in another thread).

Andrei's initial suggestion is to put it in the vtable. If we 
know the initial size of the vtable, we can grow it to 
accommodate new slots. In fact we can already do something along 
those lines...sort of:

import std.stdio;

class Foo {
   abstract void report();
}

class Bar : Foo {
   override void report() { writeln("I'm fine!"); }
}

void main() {
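   // Build a copy of Bar's vtable with one extra slot at the end.
   void*[] newVtbl;
   auto initVtblSize = Bar.classinfo.vtbl.length;
   newVtbl.length = initVtblSize + 1;
   newVtbl[0..initVtblSize] = Bar.classinfo.vtbl[];
   newVtbl[initVtblSize] = cast(void*) 0x123456; // user-defined data
   // Copy the initializer image and point its vptr (first word)
   // at the extended vtable.
   byte[] newInit = Bar.classinfo.m_init.dup;
   *cast(void***) newInit.ptr = newVtbl.ptr;
   Bar.classinfo.m_init = newInit;
   // Objects created from now on carry the extended vtable.
   Foo foo = new Bar();
   foo.report(); // I'm fine!
   writeln((*cast(void***)foo)[initVtblSize]); // 123456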
}

This works with dmd and gdc, but not with ldc2. Still, it gives 
an idea of what the extension would look like.

A variant of the idea is to allocate the user slots *before* the 
vtable and access them via negative indices. It would be faster.
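
As a rough sketch of that layout only: the code below builds a 
block with the user slots in front of a copy of the vtable, and 
returns a pointer to the first vtable entry. The helper name 
makeVtblWithPrefix is hypothetical, and the sketch does not 
install the new pointer into any object; that step would still 
need something like the m_init trick above or compiler support.

```d
import std.stdio;

class Foo {
   void report() { writeln("I'm fine!"); }
}

// Hypothetical helper, not part of druntime: copies the vtable of
// `ci` into a larger block and returns a pointer to the copied
// entry 0, leaving `userSlots` free words just before it.
void** makeVtblWithPrefix(TypeInfo_Class ci, size_t userSlots) {
   auto block = new void*[userSlots + ci.vtbl.length];
   block[userSlots .. $] = ci.vtbl[];
   return block.ptr + userSlots;
}

void main() {
   enum userSlots = 1;
   void** vptr = makeVtblWithPrefix(Foo.classinfo, userSlots);

   // The user slot lives at index -1 relative to the vtable pointer.
   *(vptr - 1) = cast(void*) 0x123456;

   // Regular virtual function entries keep their usual indices.
   assert(vptr[0] is Foo.classinfo.vtbl[0]);
   assert(*(vptr - 1) is cast(void*) 0x123456);
}
```

With this layout the existing vtable indices are unchanged, and a 
user slot is reached at a fixed negative offset from the vtable 
pointer.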

Of course we would need a thread-safe facility that libraries 
would call to obtain (and release) slots in the extended vtable, 
and that would return the index of the allocated slot(s). Thus a 
library would call one API to (globally) reserve a new slot, then 
another to grow the vtables of the classes it targets. Finding 
and growing all the vtables automatically is not feasible, 
because nested classes are not locatable via ModuleInfo.
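
The slot reservation half of such a facility could be sketched as 
follows. Everything here is an assumption for illustration: the 
SlotRegistry class and its reserve and release methods do not 
exist in druntime, and a real facility would be a single 
process-wide instance rather than a local object. Growing the 
vtables of the targeted classes is not shown.

```d
import core.sync.mutex : Mutex;

// Hypothetical registry of user slot indices, guarded by a mutex.
final class SlotRegistry {
   private Mutex mtx;
   private bool[] used;   // used[i] is true while slot i is reserved

   this() { mtx = new Mutex; }

   // Reserve the lowest free slot and return its index.
   size_t reserve() {
      mtx.lock();
      scope (exit) mtx.unlock();
      foreach (i, taken; used) {
         if (!taken) { used[i] = true; return i; }
      }
      used ~= true;
      return used.length - 1;
   }

   // Give a slot back so that another library can reuse it.
   void release(size_t index) {
      mtx.lock();
      scope (exit) mtx.unlock();
      used[index] = false;
   }
}

void main() {
   auto registry = new SlotRegistry;
   auto a = registry.reserve();
   auto b = registry.reserve();
   assert(a == 0 && b == 1);

   // A released index is handed out again on the next request.
   registry.release(a);
   assert(registry.reserve() == a);
}
```

Reusing released indices keeps the extended vtables small, at the 
cost that a library must stop touching a slot once it has 
released it.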

Walter also reminded me of the __monitor field so I played with 
it too. Here is a prototype of what per-instance user-defined 
slots could look like.

import std.stdio;

class Foo {
}

void main() {
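   // Copy Foo's initializer image so that it can be modified.
   byte[] init;
   init.length = Foo.classinfo.m_init.length;
   init[] = Foo.classinfo.m_init[];
   // Word 1 of an object is the hidden __monitor field; store the
   // user data there.
   (cast(void**) init.ptr)[1] = cast(void*) 0x1234;
   Foo.classinfo.m_init = init;
   Foo foo = new Foo();
   // Prints 1234 with dmd and gdc, null with ldc2.
   writeln((cast(void**) foo)[1]);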
}

This works with dmd and gdc but not with ldc2.

This may be useful for implementing reference-counting schemes, 
Observers, etc.

In both cases I use the undocumented 'm_init' field in ClassInfo. 
The books and docs do talk about the 'init' field that is used to 
initialize structs, but I have found no mention of 'm_init' for 
classes. Perhaps we could document it and make it mandatory that 
an implementation uses its content to pre-initialize objects.

Also here I am using the space reserved for the '__monitor' 
hidden field. This is a problem because 1/ it will go away some 
day 2/ it is only one word. Granted, that word could store a 
pointer to a vector of words, where user-defined slots would 
live; but that would be at the cost of performance.

Finally, note that if you have per-instance user slots and a way 
of automatically initializing them when an object is created, 
then you also have per-class user-defined metadata: just allocate 
a slot in the object, and put a pointer to the data in it.

Please send in comments, especially if you are a library author 
and have encountered a need for this kind of thing. Eventually 
the discussion may lead to the drafting of a DIP.




