The future of lambda delegates

Wed Aug 16 07:55:38 PDT 2006

D's delegates are great.  They are vastly superior to C++'s confused 
pointer-to-member functions, and much more useful than Java's adapter 
classes.  In the world of system programs, nothing comes close.  One 
interesting type of delegate is the anonymous-nested-delegate-literal, or 
"lambda" delegate.  For those unacquainted with such niceties, here are the 
documentation links:
http://www.digitalmars.com/d/function.html#nested
http://www.digitalmars.com/d/expression.html#FunctionLiteral

Recent D releases (notably 0.161) have updated the syntax and improved the 
overall usability of lambda delegates. All the syntactic sugar is nice, but 
it also lays a few traps for the unwary.  Here is a simple example:

// The Fibonacci numbers are an integer sequence such that
//        F(n) = F(n - 1) + F(n - 2)
//        And F(0) = 0, F(1) = 1
int delegate() fibs()
{
    int a = 0;            // Initialize the last two terms of the Fibonacci 
sequence
    int b = 1;

    return
    {
        int c = a + b;     // Calculate the next term in the sequence
        a = b;               // Shift the previous terms back
        b = c;
        return c;            // Return the result
    };
}

This function returns a function which will sequentially evaluate all of the 
Fibonacci numbers.  Notice that the inner delegate modifies the variables a 
and b in fibs() scope.  Because of this, it is not guaranteed to work after 
fibs returns.  This is most irritating, and it greatly restricts the use of 
this technique. Another potential use for lambda delegates is to create 
one-line adapter methods.  Consider the following attempt to create a button 
which will display an arbitrary message when clicked:

Button createButton(char[] click_msg)
{
    Button b = new Button();
    b.mouseClickCallback = { MsgBox(click_msg); };
    return b;
}

Once more, this sort of method relies on access to the outer function scope 
from within the lambda delegate.  Given the current semantics, this code is 
doomed to fail in any number of random ways.  As a final example, suppose we 
want to perform a lengthy calculation in a separate thread, and only wait 
for the value once it is needed.  In this case, one could try to do the 
following:

int[] delegate() sort_threaded(int[] array)
{
    int[] result = array.dup;

    //Create and run a worker thread to perform an expensive operation, (in 
this case a sort)
    Thread tsort = new Thread(
     {
        result.sort;
        return 0;
     });
    tsort.start();

    //The returned lambda delegate waits for the thread's calculation to 
finish, then returns the result.
    return { tsort.wait; return result; };
}

In this situation, we can let the thread execute in the background while the 
program performs other tasks, and only wait for the result once we need it. 
This type of deferred calculation can be very useful for improving the 
amount of parallelism within an application without adding much 
synchronization overhead.  It is very sad that none of these examples work, 
since there are so many nice solutions just like them.

One possible solution is to allocate the frame for each function containing 
nested-functions on the heap.  This allows any returned delegates to use the 
member variables freely.  One could think of it as translating the function 
into a class with a thunk.  Here is above Fibonacci function rewritten in 
this way:

class fibs_
{
    int a = 0;
    int b = 1;

    int lambda1()
    {
        int c = a + b;
        a = b;
        b = c;
        return c;
    }

    int delegate() func()
    {
        return &lambda1;
    }
}

int delegate() fibs()
{
    return (new fibs_()).func();
}

Such a translation is possible for any of these examples, albeit quite 
tedious.  From what I gather C# applies a similar technique with its lambda 
delegates and inner classes.  For a machine code compiler, such a rewrite is 
not even necessary, since it could simply overwrite the frame pointer at the 
start of the function with a GC allocated block.  This would preserve 
frame-based addressing for arguments and variables, requiring no change in 
any of the internally generated machine code.  The run time overhead is 
fairly low, only on the order of one extra memory allocation per function 
call.  Practically, such an implementation would be extremely simple.

Any thoughts or comments?

-Mikola Lysenko