Futurism lib (futures in D)

Kevin Bealer kevinbealer at gmail.com
Sun Jan 21 17:57:06 PST 2007


== Quote from Saaa (empty at needmail.com)'s article
> I read wikipedia about future and it sounds interesting.
> Do you have a bit more information, or some place where I can get it?
> As I read it, futurism takes care of threading and synchronization of data
> (between threads).

Let's say I needed to find the sum of an array.

int sum(int[] x)
{
    int rv = 0;
    foreach(int y; x) {
        rv += y;
    }
    return rv;
}

On a multicpu machine, this operation will run on one CPU.  With futures, I can
write another routine that utilizes up to 4 CPUs:

int sum_4(int[] x)
{
    alias Future!(int) FInt;
    int part = x.length/4;

    // Each of these queues a work item to be done either on some
    // other thread, or by the main thread here.

    FInt x1 = new FInt({ return sum(x[0..part]); });
    FInt x2 = new FInt({ return sum(x[part..part*2]); });
    FInt x3 = new FInt({ return sum(x[part*2..part*3]); });
    FInt x4 = new FInt({ return sum(x[part*3..$]); });

    // The addition below waits for each result to complete
    // before adding it to the total.

    return x1.value + x2.value + x3.value + x4.value;
}

In this case, sum_4 splits the array into four parts and queues each quarter as a
work item.  The four parts will be done in parallel, so the whole sum should take
about 1/4 as long, plus some overhead for building the future objects and so on.

The idea is that any time a function needs to do several things that don't depend
on each other, each of them can be wrapped in a future and they will run in
parallel.
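
For example, the same pattern works for any two calls that don't depend on each
other, not just slices of one array.  Here is a small sketch that reuses sum() and
the Future!(int) interface from the code above (sum_both is just a made-up name):

int sum_both(int[] a, int[] b)
{
    alias Future!(int) FInt;

    // The two sums are independent, so both are queued as work
    // items and can run on different threads at the same time.
    FInt fa = new FInt({ return sum(a); });
    FInt fb = new FInt({ return sum(b); });

    // Reading .value blocks only until that particular result is ready.
    return fa.value + fb.value;
}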

My motivation for writing this was a talk by Herb Sutter; it's available on Google
video here:

http://video.google.com/videoplay?docid=7625918717318948700&q=sutter

It's pretty long, but I thought it was interesting -- his basic argument is that
computers have gotten faster up to now mostly because CPUs have had higher clock
speeds and better pipelining.  But from now on, computers will run faster mostly
because they have a lot of CPUs or cores working together.

We are seeing the start of this with the dual-core and quad-core CPUs being
produced now.

But the *software* won't actually run any faster unless the programs can be
written in a multithreaded way.  You will have the same "Photoshop" running at the
same speed on one CPU, with other CPUs idling away doing nothing.

The problem is that it's hard to write good applications that take advantage of
the extra CPUs.  It's very hard to write good code with just Thread and Mutex
semantics, and monitor/sleep is only a little better.

Worse, if I write a library that uses mutexes to synchronize access, I can't
really combine it with another library that does the same thing, because to design
mutex-using code that doesn't deadlock, you need to know everything about the
locking patterns for the whole application.  If you combine two libraries that
individually can't deadlock, the result often *will* have potential deadlocks.
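
To make that concrete, here is a made-up sketch (lockA, lockB, and the two
library functions are hypothetical): each library is deadlock-free on its own,
but takes its lock in the opposite order from the other.

Object lockA, lockB;

static this() { lockA = new Object; lockB = new Object; }

// "Library A": takes its own lock, then calls into code that
// happens to take B's lock.
void libraryA()
{
    synchronized (lockA) {
        synchronized (lockB) {
            // ... do some work ...
        }
    }
}

// "Library B": takes its own lock, then calls into code that
// happens to take A's lock.
void libraryB()
{
    synchronized (lockB) {
        synchronized (lockA) {
            // ... do some work ...
        }
    }
}

// If one thread calls libraryA() while another calls libraryB(),
// each can grab its first lock and then wait forever for the
// other's -- a deadlock that neither library has by itself.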

But, with the future idiom, you can write code that uses extra threads to do work
whenever you have two calculations that don't depend on each other.  In the
example above, the four partial sums can be computed separately.  In the Futurism
source code on dsource, there is a qsort algorithm that uses futures to do work in
parallel.
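
I won't paste the actual dsource code here, but the rough shape of a parallel
qsort with futures looks something like this (just a sketch, not the real thing;
the cutoff value and partition scheme are arbitrary, and it reuses Future!(int)
since that's the only interface shown above):

void par_qsort(int[] a)
{
    const int CUTOFF = 1000;

    if (a.length < 2)
        return;

    void swap(size_t p, size_t q)
    {
        int tmp = a[p]; a[p] = a[q]; a[q] = tmp;
    }

    // Partition around the last element (any correct scheme would do).
    int pivot = a[a.length - 1];
    size_t i = 0;

    for (size_t j = 0; j + 1 < a.length; j++) {
        if (a[j] < pivot) {
            swap(i, j);
            i++;
        }
    }
    swap(i, a.length - 1);

    int[] left  = a[0 .. i];
    int[] right = a[i + 1 .. $];

    if (a.length >= CUTOFF) {
        // Big enough to be worth a work item: sort the left half
        // on another thread while this thread sorts the right half.
        alias Future!(int) FInt;
        FInt f = new FInt({ par_qsort(left); return 0; });
        par_qsort(right);
        int done = f.value;   // blocks until the left half is finished
    } else {
        // Small slices aren't worth the overhead of a future.
        par_qsort(left);
        par_qsort(right);
    }
}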

Kevin


