Optimizing delegates

Sun Dec 19 08:15:00 PST 2010

On 2010-12-19 16:32, Ary Borenszweig wrote:
> I have this code:
>
> ---
> import std.stdio;
>
> int foobar(int delegate(int) f) {
>     return f(1);
> }
>
> int foobar2(string s)() {
>     int x = 1;
>     mixin("return " ~ s ~ ";");
> }
>
> void main() {
>     writefln("%d", foobar((int x) { return 2*x; }));
>     writefln("%d", foobar2!("9876*x"));
> }
> ---
>
> When I compile it with -O -inline I can see with obj2asm that for the first writefln the delegate is being called. However, for the second it just passes
> 9876 to writefln.
>
>  From this I can say many things:
>   - It seems that if I want hyper-high performance in my code I must use string mixins because delegate calls, even if they are very simple and the
> functions that uses them are also very simple, are not inlined. This has the drawback that each call to foobar2 with a different string will generate a
> different method in the object file.
>   - When using string mixins, as mentioned by several others, if I invoke a function, that function is evaluated in the scope of foobar2, not in the
> original scope. That limits a lot what I can pass a string.
>   - That means that if I want my users to use either strings or delegates I have to take that into account, complicating a lot the function signature and
> the implementation.
>   - That also means I can't always have hyper-high performance because if I need to do something but I can't do it with a string I have to rely on
> delegates, which won't be optimized.
>   - Understanding foobar2 is harder than understanding foobar. First, it has two statements. :-P. Then I have to concatenate the strings in my head to form
> the resulting function. And that one is a simple one, so for a complex one it must be even harder to understand.
>
>  From all of these points I can see string mixins are not a very good idea to be used as callback functions, the main reasons being: it doesn't always work,
> it's harder to read and to write and it makes the object file bigger.
>
> So my suggestion is: optimize delegates! Take some time to optimize them to the maximum and then you won't need string mixins. If foobar here can be
> optimized like foobar2, that's it! Why would anyone want to write a function like foobar2 if the resulting code will be the same?
>
> So here's an idea, it just ocurred to me while writing this post. In the code above, the compiler sees this:
>
> writefln("%d", foobar((int x) { return 2*x; }));
>
> foobar is this:
>
> int foobar(int delegate(int) f) {
>     return f(1);
> }
>
> It sees foobar takes a delegate, so what it does it creates a function foobar_1 (or some name not already used) that results from rewriting foobar with the
> delegate inlined:
>
> int foobar_1(int delegate(int) f) {
>     return 2*1;
> }
>
> So now we have:
>
> writefln("%d", foobar_1((int x) { return 2*x; }));
>
> It then optimizes that line and ends up with this:
>
> writefln("%d", 2);
>
> (and I checked if dmd inlines that call, because it passes a delegate it doesn't use, and yes, it is being inlined).
>
> Then the compiler can discard foobar_1 because it doesn't use it anymore. No object file bloat, full speed.
>
> What if the delegate is more complex? Let's try:
>
> writefln("%d", foobar((int x) { int a = 10; return a*x; }));
>
> int foobar_1(int delegate(int) f) {
>     int delegateResult;
>     {
>       int a = 10;
>       delegateResult = a*1; // 1 was x
>     }
>     return delegateResult;
> }
>
> More complex:
>
> int bar(int y) {
>    return y * 3;
> }
> writefln("%d", foobar((int x) { return bar(x); }));
>
> For this case... I don't know. Maybe it could define bar inside of foobar_1 again, or maybe rewrite foobar_1 to accept another delegate and pass bar to it,
> and then reapply this algorithm.
>
> Anyhow... maybe what I say doesn't make sense at all, it's the wrong way to optimize code or whatever. My point is: if the compiler were smarter I wouldn't
> need string mixins to get my code optimized. If you are going to define fancy functions like map, sort, and other functions that basically receive callback
> functions, optimize those callback to the maximum, don't make the users optimize those callbacks by using string mixins. It's very frustrating for the
> user, when she has to sit down and write a high-performance function, to think "Ok, how can I write this code so the compiler already sees it as being very
> easy to optimize?". With D it's not about thinking about the data-structures to use and the algorithms to use anymore, it's about thinking about writing
> code that looks easy to optimize for the compiler. :-(

I completely agree with this post.

-- 
/Jacob Carlborg