foreach and metaprogramming

Wed Nov 8 07:27:37 PST 2006

> I'm from the speed is good camp. I like D because it seems to give good speed
> while still being easy to use.

(Sorry for the long post.) I don't think speed needs to be a problem if 
you implement it as a compile-time language for lexical-aware macros. 
Here is one potential way of doing this, to show you what I'm thinking 
about:

This would consist of four different language features.

The first one is just syntactic sugar and should be fairly easy to 
implement: Ruby-style blocks. It could be implemented as just sending a 
delegate as the last function argument:

foo(1) { writefln("Hello"); }
foo(1, { writefln("Hello"); });

These two calls would be the same. I don't know if this is hard to do,
but I can't see why it would be. It also makes for some very elegant
things in other places; look at Ruby for examples.
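
To make the desugared form concrete, here is a minimal sketch in plain D
that compiles without any new syntax. The body of foo (calling the block
n times) is purely my assumption for the demo; only the shape of the call
matters.

```d
import std.stdio : writefln;

// Hypothetical foo: what it does with the block is an assumption;
// here it simply runs the block n times.
void foo(int n, void delegate() block)
{
    foreach (i; 0 .. n)
        block();
}

void main()
{
    // The proposed  foo(1) { writefln("Hello"); }  would be rewritten
    // by the compiler into exactly this call:
    foo(1, { writefln("Hello"); });
}
```

So the sugar would only move the trailing delegate literal outside the
parentheses; nothing changes in how foo itself is declared.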

The second one is two new operators; I'll call them $ and #. $ and # can
be put before expressions, statements, declarations and identifiers.
Examples are: $int a = 5; ${ /* code block */ } class $Name {}. They bind
very tightly; you should be able to assume that only the thing right
next to the operator is bound.

# returns the AST of the construct that follows it; $ is for declaring
macro blocks.

$ blocks may return nothing, in which case they are replaced by nothing:
${int a = 5;}; expands to nothing. They can also return literals, e.g.
int a = ${return 5;}; (which becomes int a = 5;). Or they can return AST
objects, and then they are replaced by that tree:

${ return #{if (a == 5) die();}; } // This is equal to if (a == 5) die();

The argument to # must be a complete expression/declaration/statement, 
so #{ if (a == } is invalid. The only operation you can do with ASTs 
created with # is to concatenate with the ~ operator.
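
Here is a toy model of that restriction in ordinary D, just to show how
little the compiler would have to expose. Keeping the fragment as source
text is an assumption made for illustration (a real implementation would
hold a tree), and the names Ast and guardThenLog are hypothetical.

```d
import std.stdio : writeln;

// Toy stand-in for the proposed AST objects. Storing source text is an
// assumption for this sketch; the proposal itself means a real tree.
struct Ast
{
    string code;

    // The only operation offered: concatenate two complete fragments.
    Ast opBinary(string op : "~")(Ast rhs)
    {
        return Ast(code ~ "\n" ~ rhs.code);
    }
}

// Plays the role of a function defined inside a $ block.
Ast guardThenLog()
{
    auto check = Ast("if (a == 5) die();");   // stands for #{ if (a == 5) die(); }
    auto log   = Ast(`writefln("checked");`); // stands for #{ writefln("checked"); }
    return check ~ log;
}

void main()
{
    writeln(guardThenLog().code);
}
```

Because each piece is a complete statement before it is joined, the
result is always well-formed, which is the point of forbidding things
like #{ if (a == }.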

I think you could implement this by making all $ blocks functions in a 
separate "meta"-module (used only internally when compiling). So if you 
define a function within a macro block, you define a function in the 
"meta"-module, essentially defining a macro that can be invoked with

${return macroname(args);}

Given this, most of what I'm describing can already be done, but I'd
suggest adding some more syntactic sugar: a macro keyword.

macro foreach;

This declares that every call to foreach should be wrapped in
${return ;} and every argument enclosed in #{}. That way you can call
the compile-time function foreach as though it were a normal function.
There is one more thing to add:

macro int hello(int a) {
   return a;
}

is equivalent to
${ int hello(int a) { return a; } };
macro hello;

The last feature that might be added is a library, accessible only to
compile-time code, that lets you reflect over classes and functions.
(Probably only in the same module as the calling code, but I don't think
that's a big limitation.)

As an example, I will show how you can implement a nifty kind of
iterator using this technique.

class LinkedList {
   // ...
   macro each(void delegate(Type t) block) {
     return #{
       while (iterate over the list) {
          block(data);
       }
     };
   }
   // ...
}

LinkedList ll = new LinkedList;
// Add data..
// I borrowed Ruby's syntax. I don't really like it, but it works as a demo.
ll.each { |Type data|
   writefln(data);
}

(This example obviously requires the macro stuff to be done *after* 
templates, but I think that makes sense.)

It can't be hard for the compiler to recognize inline delegate calls, so
this code should be able to expand into a plain loop with no method
calls at all.
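
For comparison, here is a sketch of the two ends of that expansion in
plain D: an ordinary run-time each that pays one delegate call per
element, and the loop I'd hope the macro version leaves behind, written
out by hand. The int element type and the names add and sumInlined are
assumptions for the demo, not part of the proposal.

```d
import std.stdio : writeln;

// Minimal singly linked list of ints. Element type and member names
// are assumptions; the example above leaves them open.
class LinkedList
{
    private static class Node
    {
        int data;
        Node next;
        this(int data) { this.data = data; }
    }

    private Node head, tail;

    void add(int value)
    {
        auto n = new Node(value);
        if (tail is null) head = n;
        else tail.next = n;
        tail = n;
    }

    // Ordinary run-time version: one delegate call per element.
    void each(void delegate(int) block)
    {
        for (auto n = head; n !is null; n = n.next)
            block(n.data);
    }

    // Hand-written equivalent of what the macro version should expand
    // to when the block is  total += data;  (no calls in the loop).
    int sumInlined()
    {
        int total = 0;
        for (auto n = head; n !is null; n = n.next)
            total += n.data;
        return total;
    }
}

void main()
{
    auto ll = new LinkedList;
    ll.add(1); ll.add(2); ll.add(3);

    int total = 0;
    ll.each((int data) { total += data; });
    writeln(total, " ", ll.sumInlined());
}
```

Both paths compute the same sum; the difference is only whether the loop
body is reached through a delegate or pasted in place.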

To sum this all up:

What I'm talking about is really only a slightly more hygienic and a lot
more powerful macro system. I can't see why this would reduce
performance, but I can see how it could be used to perform heavy
high-level optimization (for example, when you have a tree that doesn't
map very well to an index operator or C++-style iterators). And it would
let you do lots of other cool things as well, like automated persistence
layers without another DSL.

The big drawback I can see is that this would give much longer
compilation times. But that's the cost of doing more at compile time and
less at runtime, isn't it? (And you should be able to optimize it
somewhat, as well.)

One thing that is good about this is that you expose very few compiler
internals; only creating and appending ASTs is needed, and the rest can
be implemented as a normal (compile-time) API.

/Per