Optimizing delegates

Sun Dec 19 07:32:00 PST 2010

I have this code:

---
import std.stdio;

int foobar(int delegate(int) f) {
   return f(1);
}

int foobar2(string s)() {
   int x = 1;
   mixin("return " ~ s ~ ";");
}

void main() {
   writefln("%d", foobar((int x) { return 2*x; }));
   writefln("%d", foobar2!("9876*x"));
}
---

When I compile it with -O -inline I can see with obj2asm that for the first writefln the delegate is being called. However, for the second it just passes
9876 to writefln.

>From this I can say many things:
 - It seems that if I want hyper-high performance in my code I must use string mixins because delegate calls, even if they are very simple and the
functions that uses them are also very simple, are not inlined. This has the drawback that each call to foobar2 with a different string will generate a
different method in the object file.
 - When using string mixins, as mentioned by several others, if I invoke a function, that function is evaluated in the scope of foobar2, not in the
original scope. That limits a lot what I can pass a string.
 - That means that if I want my users to use either strings or delegates I have to take that into account, complicating a lot the function signature and
the implementation.
 - That also means I can't always have hyper-high performance because if I need to do something but I can't do it with a string I have to rely on
delegates, which won't be optimized.
 - Understanding foobar2 is harder than understanding foobar. First, it has two statements. :-P. Then I have to concatenate the strings in my head to form
the resulting function. And that one is a simple one, so for a complex one it must be even harder to understand.

>From all of these points I can see string mixins are not a very good idea to be used as callback functions, the main reasons being: it doesn't always work,
it's harder to read and to write and it makes the object file bigger.

So my suggestion is: optimize delegates! Take some time to optimize them to the maximum and then you won't need string mixins. If foobar here can be
optimized like foobar2, that's it! Why would anyone want to write a function like foobar2 if the resulting code will be the same?

So here's an idea, it just ocurred to me while writing this post. In the code above, the compiler sees this:

writefln("%d", foobar((int x) { return 2*x; }));

foobar is this:

int foobar(int delegate(int) f) {
   return f(1);
}

It sees foobar takes a delegate, so what it does it creates a function foobar_1 (or some name not already used) that results from rewriting foobar with the
delegate inlined:

int foobar_1(int delegate(int) f) {
   return 2*1;
}

So now we have:

writefln("%d", foobar_1((int x) { return 2*x; }));

It then optimizes that line and ends up with this:

writefln("%d", 2);

(and I checked if dmd inlines that call, because it passes a delegate it doesn't use, and yes, it is being inlined).

Then the compiler can discard foobar_1 because it doesn't use it anymore. No object file bloat, full speed.

What if the delegate is more complex? Let's try:

writefln("%d", foobar((int x) { int a = 10; return a*x; }));

int foobar_1(int delegate(int) f) {
   int delegateResult;
   {
     int a = 10;
     delegateResult = a*1; // 1 was x
   }
   return delegateResult;
}

More complex:

int bar(int y) {
  return y * 3;
}
writefln("%d", foobar((int x) { return bar(x); }));

For this case... I don't know. Maybe it could define bar inside of foobar_1 again, or maybe rewrite foobar_1 to accept another delegate and pass bar to it,
and then reapply this algorithm.

Anyhow... maybe what I say doesn't make sense at all, it's the wrong way to optimize code or whatever. My point is: if the compiler were smarter I wouldn't
need string mixins to get my code optimized. If you are going to define fancy functions like map, sort, and other functions that basically receive callback
functions, optimize those callback to the maximum, don't make the users optimize those callbacks by using string mixins. It's very frustrating for the
user, when she has to sit down and write a high-performance function, to think "Ok, how can I write this code so the compiler already sees it as being very
easy to optimize?". With D it's not about thinking about the data-structures to use and the algorithms to use anymore, it's about thinking about writing
code that looks easy to optimize for the compiler. :-(