Dynamic Code in D

bearophile bearophileHUGS at lycos.com
Sat Jan 12 10:24:54 PST 2008


Your set of problems is quite interesting. I am not very experienced in D yet (though I have implemented many evolution-based programs), but I can share some of my thoughts. There are a lot of people here who know much more than me on such matters, and they will spot any errors in what I have said here.

D is a young language and it has a lot of rough edges, so you have to forget the high level of polish that Java systems have today. I like D, and hopefully more and more people will start using it, but I think today it's not yet well suited to replace "serious" Java programs.

>I am considering doing this in D; I like the language design and there are potential performance and maintainability benefits.<

D is a very nice language, and it looks quite nice compared to Java, but note that for most of the small (1000-10000 lines) programs I have translated from Java to D, the Java6 version was 2-10 times faster. HotSpot inlines methods and has a really good GC, while DMD currently seems not to inline them, and its GC is snail-slow compared to the refined one in HotSpot (not the server one). Those two things alone make D code much slower if you code in the usual highly OOP style that Java code has.


>The system carries out data mining tasks using algorithms generated by a genetic programming engine. The original prototype (developed before I joined the company) did this by constructing a Lisp-style function tree out of Java objects, then running that in an interpreter (i.e. it used ECJ). Unsurprisingly this was hideously slow.<

For that purpose Common Lisp looks like a good language (besides Java itself). It's fast enough if used well, and it compiles functions on the fly. There are some disadvantages, for example it's not easy to find Common Lisp programmers.
How much time does a single fitness computation take on average? If it takes very little time (less than 30-40 seconds), then compilation and interfacing times become important. In that situation you need a fast compiler, etc.
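
To make the trade-off between evaluation time and compile time concrete, here is a rough Java sketch. It is only an illustration: the class name, the method names and the break-even formula (compiling pays off once compile time + n * compiled time per evaluation < n * interpreted time per evaluation) are assumptions made for this example, not something taken from the original system.

```java
public class CompileBreakEven {

    /**
     * Average wall-clock seconds per call of task, over reps calls.
     * Crude: there is no JIT warm-up handling, so treat the result
     * as a rough estimate only.
     */
    public static double averageSeconds(Runnable task, int reps) {
        if (reps <= 0) {
            throw new IllegalArgumentException("reps must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1e9 / reps;
    }

    /**
     * Smallest number of evaluations n at which compiling pays off, i.e.
     * the smallest n with compileSec + n * compiledSec < n * interpSec.
     * Returns -1 if the compiled version is not faster per evaluation.
     */
    public static long breakEvenEvaluations(double compileSec,
                                            double interpSec,
                                            double compiledSec) {
        double savedPerEval = interpSec - compiledSec;
        if (savedPerEval <= 0) {
            return -1; // compiling never pays off
        }
        // n must be strictly greater than compileSec / savedPerEval
        return (long) Math.floor(compileSec / savedPerEval) + 1;
    }
}
```

If each candidate function is evaluated fewer times than the break-even count, the compiler's speed matters more than the speed of the code it generates.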
 

>The dynamic java code calls this via the Java Native Interface (specifically, a class full of static native methods). However JNI has a significant call overhead, which is eating up a good fraction of the performance boost, as well as creating additional development/maintenance hassle. This system is running on a couple of beefy 16 core Opteron servers and it still takes hours to run on large datasets.<

Often the bigger gains in such evolutionary programs come from "improving" the search space, adding more heuristics, etc., not from going from Java to D.


>2. High-performance synchronization. As of 1.6, Java's monitor implementation is quite impressive;<

I don't know the answer, but in most respects D is far from being as tuned and refined. In that kind of thing only C# may be better.


>3. SSE intrinsics. Does GDC have the equivalent of GCC SSE intrinsics yet<

Nope, I think.


>1. Output as C source plus a JNI stub in bytecode. Invoke gcc on it to build a dynamic library.<

If the running time of that fitness function is very short you may need to find something faster than GCC, like TinyCC (but I think that's not your situation, and most probably your fitness functions are small, so they are very quick to compile).
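
As an illustration of the "output as C source" step quoted above, here is a minimal Java sketch that turns a small expression tree into the text of a C function. All the names (ExprToC, Node, Const, Var, BinOp, emitFunction) are hypothetical and the real tree representation in your system is surely richer. The sketch only produces a string; writing it to a file, invoking gcc or TinyCC and loading the result through JNI are left out.

```java
public class ExprToC {

    /** A node of a hypothetical GP expression tree. */
    public interface Node {
        String toC();
    }

    /** A floating-point constant. */
    public static final class Const implements Node {
        private final double value;

        public Const(double value) {
            // NaN and infinities have no plain C literal, so reject them here.
            if (Double.isNaN(value) || Double.isInfinite(value)) {
                throw new IllegalArgumentException("non-finite constant");
            }
            this.value = value;
        }

        @Override
        public String toC() {
            return Double.toString(value); // e.g. "2.0" or "1.0E-5"
        }
    }

    /** The i-th input variable, read from the C array x. */
    public static final class Var implements Node {
        private final int index;

        public Var(int index) {
            if (index < 0) {
                throw new IllegalArgumentException("negative index");
            }
            this.index = index;
        }

        @Override
        public String toC() {
            return "x[" + index + "]";
        }
    }

    /** A binary arithmetic operation: one of + - * / */
    public static final class BinOp implements Node {
        private final char op;
        private final Node left;
        private final Node right;

        public BinOp(char op, Node left, Node right) {
            if ("+-*/".indexOf(op) < 0) {
                throw new IllegalArgumentException("unsupported operator: " + op);
            }
            this.op = op;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toC() {
            // Always parenthesize, so operator precedence never matters.
            return "(" + left.toC() + " " + op + " " + right.toC() + ")";
        }
    }

    /** Emits a C function; name is assumed to be a valid C identifier. */
    public static String emitFunction(String name, Node root) {
        return "double " + name + "(const double *x) {\n"
             + "    return " + root.toC() + ";\n"
             + "}\n";
    }
}
```

For example, the tree for x[0] + 2.0 * x[1] gives a function whose body is `return (x[0] + (2.0 * x[1]));`. Functions this small should compile very quickly with either compiler.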

Look at other languages/solutions too, like Common Lisp, Java + TinyCC-compiled code, Python+Psyco with Cython code compiled on the fly, etc.

I think you can answer most of your questions about D by spending just 1-2 days on some small benchmarks. Maybe you will like the language but not its performance for your purposes :-)

Bye,
bearophile


