Expected performance

bearophile bearophileHUGS at lycos.com
Wed Nov 7 02:31:49 PST 2007


Martin Hess:
> The test builds a dictionary of 50k byte arrays that map to integers.
> int[byte[]] dictionary;
> Next it looks up elements a half billion times.

>From my tests I think D AAs aren't much optimized yet, in the past I have shown simple Python code that uses Python dicts that runs faster than D code (you can find it in the group or I can try to find it again), and I have done similar tests with C# 3.0 beta with similar results. To have fast hash maps you need a good hash function and good hash collision management. And if you put tons of keys like that you also need a good GC (or memory management). With time D will probably improve, D developers can take a look at how Python implemens dicts, to copy them all (Perl dicts may be fit too).

With your situation you have both lot of memory to manage and a large data block to compute hash on, so you need both quick memory management and and a fast hash function. For the memory management I can't help you much, but for the hash function you may take a look at the assembly version of the superfasthash:

http://www.azillionmonkeys.com/qed/hash.html
http://www.team5150.com/~andrew/blog/2007/03/breaking_superfasthash.html
http://www.team5150.com/~andrew/blog/2007/03/when_bad_hashing_means_good_caching.html

Bye,
bearophile



More information about the Digitalmars-d mailing list