Cilk/Cilk++

bearophile bearophileHUGS at lycos.com
Thu Sep 4 15:49:21 PDT 2008


Charles E. Leiserson:

>If you have a deep interest, I'd be happy to discuss what would be involved in inserting Cilk technology into D.<

D is a free language, so probably that discussion can't give much money directly to Cilk Arts. But if such keywords and semantics (what you call "cilk technology") become more widespread among other languages (recently I have seen newLisp adopt them) then Cilk Arts may gain something indirectly. Walter Bright may be interested in some of the Cilk keywords, I don't know.

---------------

Related to Cilk I have just read this quite interesting article from the blog in your site:
http://www.cilk.com/multicore-blog/bid/6381/Multicore-enabling-the-N-Queens-Problem-Using-Cilk

>From that article:

>After all, it's always good to have the serial version of a parallel implementation at hand.<

Think about this alternative version of that phrase, told by an hypothetical vendor of one of the first C compilers:

"After all, it's always good to have the assembly version of a C implementation at hand."

It tells me that such technologies aren't much ripe yet :-)


>Parallelizing a piece of code with Cilk++ is easy.<

Well, it's probably doable (and probably better than some other parallel programming technologies), but for my limited brain, and limited knowledge of Cilk, it doesn't seem 'easy' :-)

One fault of that code is that it doesn't show a comparison of that Cilk++ code with the performance of code written for a normal C compiler, like GCC.

If you compare the performance of that 2/4-core implementation with the following single core C version you may have a surprise:


#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
typedef unsigned long ulong;
static const ulong ulong_bit = sizeof(ulong) * CHAR_BIT;

static inline ulong search(ulong lb, ulong cb, ulong rb, ulong cnt) {
  if (~0ul == cb)
    cnt += 1;
  else
    for (ulong bs = lb | cb | rb; ~0ul != bs;) {
      ulong b = ~bs & (bs+1);
      bs |= b;
      cnt = search((lb | b) << 1, cb | b, (rb | b) >> 1, cnt);
    }
  return cnt;
}

static inline ulong nQs(ulong m) { return search(0, ~0ul >> m, 0, 0); }

int main(int argc, char* argv[]) {
  ulong a = argc < 2 ? ulong_bit : atol(argv[1]);
  ulong n = a < ulong_bit ? a : ulong_bit;
  for (ulong i=1; i<=n; ++i)
    printf("%li: %li total solutions\n", i, nQs(i));
  return 0;
}

That code comes from:
http://c2.com/cgi/wiki?EightQueensInManyProgrammingLanguages

-----------------

So far, one of the most promising technologies I see for parallel computations can be seen from this article + supplementary data:
"BSGP: Bulk-Synchronous GPU Programming", by Qiming Hou, Kun Zhou, Baining Guo, that you can find here:
http://www.kunzhou.net/

Bye and thank you,
bearophile



More information about the Digitalmars-d mailing list