Compile-time reflection

Sun Jul 1 15:59:40 PDT 2007

The subject of compile-time reflection has been an important one to me. 
I have been musing on it since about the time I started writing Pyd. 
Here is the current state of my thoughts on the matter.

----
Functions
----

When talking about functions, a given symbol may refer to multiple 
functions:

void foo() {}
void foo(int i) {}
void foo(int i, int j, int k=20) {}

The first thing a compile-time reflection mechanism needs is a way to, 
given a symbol, derive a tuple of the signatures of the function 
overloads. There is no immediately obvious syntax for this.

The is() expression has so far been the catch-all location for many of 
D's reflection capabilities. However, is() operates on types, not 
arbitrary symbols.
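
For illustration, here is a minimal sketch of what is() can already do once it is handed a function /type/. The function add and the alias F are made up for this example, and the syntax assumes a current D2 compiler.

```d
// Hypothetical function, used only to have a type to inspect.
int add(int a, long b) { return a; }

// typeof(add) is the function's type; is() works on that, not on the symbol.
alias F = typeof(add);

// is(F Params == function) binds Params to the parameter type tuple.
static if (is(F Params == function))
{
    static assert(Params.length == 2);
    static assert(is(Params[0] == int));
    static assert(is(Params[1] == long));
}
else
    static assert(false, "F is not a function type");

// is(F R == return) binds R to the return type.
static if (is(F R == return))
    static assert(is(R == int));

void main() {}
```

Note that nothing here can enumerate overloads: typeof() yields a single type, which is exactly the limitation described above.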

A property is more promising. Re-using the .tupleof property is one idea:

foo.tupleof => Tuple!(void function(), void function(int), void function(int, int, int))

However, I am not sure how plausible it is to have a property on a 
symbol like this. Another alternative is to have some keyword act as a 
function (as typeof and typeid do, for instance). I propose adding 
"tupleof" as an actual keyword:

tupleof(foo) => Tuple!(void function(), void function(int), void function(int, int, int))

I will be using this syntax throughout the rest of this post. For the 
sake of consistency, tupleof(Foo) should do what Foo.tupleof does now.

To unambiguously refer to a specific overload of a function, two pieces 
of information are required: The function's symbol, and the signature of 
the overload. When doing compile-time reflection, one is typically 
working with one specific overload at a time. While a function pointer 
does refer to one specific overload, it is important to note that 
function pointers are not compile-time entities! Therefore, the 
following idiom is common:

template UseFunction(alias func, func_t) {}

That is, to be useful, any template that does something with a function 
needs both the function's symbol and the signature of the particular 
overload it is to operate on.
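
A minimal sketch of this idiom, assuming a current D2 compiler. The member function overload and the global variable last are hypothetical additions for this example; last exists only so the selected overload can be observed at run time.

```d
int last = -1; // illustration only: records which overload ran

void foo() { last = 0; }
void foo(int i) { last = 1; }
void foo(int i, int j, int k = 20) { last = 3; }

template UseFunction(alias func, func_t)
{
    // Taking the address of the symbol with a known target type
    // selects the overload whose signature is func_t.
    func_t overload()
    {
        func_t p = &func;
        return p;
    }
}

void main()
{
    auto p1 = UseFunction!(foo, void function(int)).overload();
    p1(5);
    assert(last == 1);

    auto p3 = UseFunction!(foo, void function(int, int, int)).overload();
    p3(1, 2, 3);
    assert(last == 3);
}
```

The pointer only exists at run time, which is why the template has to carry the symbol and the signature separately.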

It should be clear, then, that automatically deriving the overloads of a 
given function is very important. Another piece of information that is 
useful is whether a given function has default arguments, and how many. 
The tupleof() syntax can be re-used for this:

tupleof(foo, void function(int, int, int)) => Tuple!(void function(int, int))

Here, we pass tupleof() the symbol of a function, and the signature of a 
particular overload of that function. The result is a tuple of the 
additional signatures the overload may be called with thanks to its 
default arguments; the overload's /actual/ signature is not included. 
The most useful piece of information 
here is the /number/ of elements in the tuple, which will be equal to 
the number of default arguments supported by the overload.
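
The two-argument tupleof() form is only a proposal. For comparison, and assuming a present-day D2 compiler with Phobos, std.traits.ParameterDefaults reports similar information in a different shape: one entry per parameter, with void standing in for parameters that have no default. Only the three-parameter overload is declared in this sketch, so that the template sees exactly that function.

```d
import std.traits : ParameterDefaults;

// Only one overload here, to keep the symbol unambiguous.
void foo(int i, int j, int k = 20) {}

alias Defaults = ParameterDefaults!foo;

static assert(Defaults.length == 3);      // one entry per parameter
static assert(is(Defaults[0] == void));   // i has no default
static assert(is(Defaults[1] == void));   // j has no default
static assert(Defaults[2] == 20);         // k defaults to 20

// One default argument means one extra way to call the function.
static assert( __traits(compiles, foo(1, 2)));
static assert(!__traits(compiles, foo(1)));

void main() {}
```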

One might be tempted to place these additional function signatures in 
the original tuple derived by tupleof(foo). However, this is not 
desirable. Consider: We can say any of the following:

void function() fn1 = &foo;
void function(int) fn2 = &foo;
void function(int, int, int) fn3 = &foo;

But we /cannot/ say this:

void function(int, int) fn4 = &foo; // ERROR!

A given function-symbol therefore has two sets of function signatures 
associated with it: The actual signatures of the functions, and the 
additional signatures it may be called with due to default arguments. 
These two sets are not equal in status, and should not be treated as such.
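
The difference between the two sets can be checked at compile time. This is a sketch assuming a D2 compiler, where __traits(compiles, ...) reports whether a piece of code would compile.

```d
void foo() {}
void foo(int i) {}
void foo(int i, int j, int k = 20) {}

// Actual signatures: a function pointer of each type can be formed.
static assert( __traits(compiles, { void function() p = &foo; }));
static assert( __traits(compiles, { void function(int) p = &foo; }));
static assert( __traits(compiles, { void function(int, int, int) p = &foo; }));

// Signature available only through the default argument:
// the call is fine, but no function pointer of that type exists.
static assert( __traits(compiles, { foo(1, 2); }));
static assert(!__traits(compiles, { void function(int, int) p = &foo; }));

void main() {}
```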

----
Member functions
----

Here is where things get really complicated.

class A {
     void bar() {}
     void bar(int i) {}
     void bar(int i, int j, int k=20) {}

     void baz(real r) {}

     static void foobar() {}
     final void foobaz() {}
}

class B : A {
     void foo() {}
     override void baz(real r) {}
}

D does not really have pointers to member functions. It is possible to 
fake them with some delegate trickery. In particular, there is no way to 
directly call an alias of a member function. This is important, as I 
will get to later.
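
One form of the delegate trickery, as a rough sketch assuming a D2 compiler: take the context-free function pointer that &A.baz yields outside the class, then attach an instance by hand through the delegate's .funcptr and .ptr properties. The class is trimmed down and the field seen is a hypothetical addition so the call can be observed. Note that this calls A.baz directly and bypasses virtual dispatch.

```d
class A
{
    real seen = 0;                 // illustration only
    void baz(real r) { seen = r; }
}

void main()
{
    A a = new A;

    // Outside of A, this is a plain function pointer with no instance.
    // It is not safe to call on its own.
    auto fp = &A.baz;

    // Build a delegate by hand: code pointer plus context pointer.
    void delegate(real) dg;
    dg.funcptr = fp;
    dg.ptr = cast(void*) a;

    dg(5.5);
    assert(a.seen == 5.5);
}
```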

The first mechanism needed is a way to get all of the member functions 
of a class. I suggest the addition of a .methodsof class property, which 
will derive a tuple of aliases of the class's member functions.

A.methodsof => Tuple!(A.bar, A.baz, A.foobar, A.foobaz)
B.methodsof => Tuple!(A.bar, A.foobar, A.foobaz, B.foo, B.baz)

The order of the members in this tuple is not important. Inherited 
member functions are included, as well. Note that these are tuples of 
symbol aliases! Since these are function symbols, all of the mechanisms 
suggested earlier for regular function symbols should still work!

tupleof(A.bar) => Tuple!(void function(), void function(int), void function(int, int, int))

And so forth.
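
Neither .methodsof nor tupleof() exists; both are proposals. As a rough point of comparison, and assuming a present-day D2 compiler with Phobos, __traits can list member /names/ and the overloads behind a name. Unlike the proposed .methodsof, the member lists are strings, and allMembers also reports members inherited from Object.

```d
import std.algorithm.searching : canFind;
import std.meta : AliasSeq;

class A
{
    void bar() {}
    void bar(int i) {}
    void bar(int i, int j, int k = 20) {}

    void baz(real r) {}

    static void foobar() {}
    final void foobaz() {}
}

class B : A
{
    void foo() {}
    override void baz(real r) {}
}

// Names declared directly in A.
static assert( [__traits(derivedMembers, A)].canFind("foobar"));
static assert(![__traits(derivedMembers, A)].canFind("foo"));

// allMembers on B includes inherited names as well.
static assert([__traits(allMembers, B)].canFind("bar"));
static assert([__traits(allMembers, B)].canFind("foo"));

// The overloads hiding behind the single name "bar".
alias Bars = AliasSeq!(__traits(getOverloads, A, "bar"));
static assert(Bars.length == 3);

void main() {}
```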

There are three kinds of member functions: virtual, static, and final. 
The next important mechanism that is needed is a way to distinguish 
these from each other. An important rule of function overloading works 
in our favor, here: A given function symbol can only refer to functions 
which are all virtual, all static, or all final. Therefore, this should 
be considered a property of the symbol, as opposed to one of the 
function itself.

The actual syntax for this mechanism needs to be determined. D has 
'static' and 'final' keywords, but no 'virtual' keyword. Additionally, 
the 'static' keyword has been overloaded with many meanings, and I 
hesitate to suggest we add another. Nonetheless, I do.

static(A.bar == static) == false
static(A.bar == final) == false
static(A.bar == virtual) == true

The syntax is derived from that of the is() expression. The grammar 
would look something like this:

StaticExpression:
	static ( Symbol == SymbolSpecialization )

SymbolSpecialization:
	static
	final
	virtual

Here, 'virtual' is a context-sensitive keyword, not unlike the 'exit' in 
'scope(exit)'. If the Symbol is not a member function, it is an error.
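
The static(...) expression is proposed syntax and will not compile. For comparison, assuming a present-day D2 compiler, the same three-way classification can be sketched with __traits queries. The non-overloaded baz stands in for the virtual case here.

```d
class A
{
    void baz(real r) {}
    static void foobar() {}
    final void foobaz() {}
}

// Virtual: neither static nor final.
static assert( __traits(isVirtualMethod, A.baz));
static assert(!__traits(isStaticFunction, A.baz));
static assert(!__traits(isFinalFunction, A.baz));

// Static.
static assert( __traits(isStaticFunction, A.foobar));
static assert(!__traits(isVirtualMethod, A.foobar));

// Final, and not overriding anything, so not a virtual method.
static assert( __traits(isFinalFunction, A.foobaz));
static assert(!__traits(isVirtualMethod, A.foobaz));

void main() {}
```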

A hole presents itself in this scheme. We can get all of the function 
symbols of a class's member functions. From these, we can get the 
signatures of their overloads. From /these/, we can get pointers to the 
member functions, do some delegate trickery, and actually call them. 
This is all well and good.

But there is a problem when a method has default arguments. As explained 
earlier, we can't do this:

// Error! None of the overloads match!
void function(int, int) member_func = &A.bar;

Even though we can say:

A a = new A;
a.bar(1, 2);

The simplest solution is to introduce some way to call an alias of a 
method directly. There are a few options. My favorite is to take a cue 
from Python, and allow the following:

alias A.bar fn;
A a = new A;
fn(a, 1, 2);

That is, allow the user to explicitly call the method with the instance 
as the first parameter. This should be allowed generally, as in:

A.bar(a);
A.baz(a, 5.5);
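
The call syntax above is a proposal and does not compile as written. Its effect can be approximated with a helper, sketched here under the assumption of a D2 compiler. The name callMethod is hypothetical, it takes the method by /name/ rather than by alias, and bar returns int in this variant only so the result can be checked.

```d
class A
{
    // Variant of bar that returns a value, for illustration only.
    int bar(int i, int j, int k = 20) { return i + j + k; }
}

// Hypothetical helper: instance first, then the arguments.
// Because this ends in an ordinary member call, overload
// resolution and default arguments work as usual.
auto callMethod(string name, T, Args...)(T obj, Args args)
{
    return __traits(getMember, obj, name)(args);
}

void main()
{
    A a = new A;
    assert(callMethod!"bar"(a, 1, 2) == 23);    // k takes its default
    assert(callMethod!"bar"(a, 1, 2, 3) == 6);
}
```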

Given these mechanisms, combined with the existing mechanisms to derive 
the return type and parameter type tuple from a function type, D's 
compile-time reflection capabilities would be vastly more powerful.

-- 
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org