Expected performance
Lionello Lunesu
lio at lunesu.remove.com
Thu Nov 8 01:15:13 PST 2007
Martin Hess wrote:
> I'm considering D for a project that is very CPU intensive so I'm evaluating D's performance. As a first test I started with associative arrays and ended up with a result that surprised me a bit so I would like to get some input on the test's construction.
>
> The test builds a dictionary of 50k byte arrays that map to integers.
>
> int[byte[]] dictionary;
>
> Next it looks up elements a half billion times.
>
> Assuming 1.5 instructions per cycle I'm seeing about 387 instructions per lookup.
>
> I tried it in Java and was seeing about 309 instructions per lookup.
>
> Nievely I assumed I would get better performance that Java. Does anyone have any thoughts on this?
>
> Test details:
>
> Mac OSX 10.5
> GDC 4.0.1
> Java 1.5.0_13
>
> ================
>
> import std.stdio;
> import std.random;
>
> char[] letters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
>
> byte[] makeGUID(int size)
> {
> byte[] a = new byte[size];
>
> for (int i = 0; i < size; i++)
> {
> a[i] = letters[rand() % letters.length];
> }
>
> return a;
> }
>
> const int tableSize = 50000;
>
> version(Unix)
> {
> private import std.c.unix.unix;
>
> int main (char[][] args)
> {
> byte[][tableSize] guids;
> int[byte[]] dictionary;
>
> for (int i = 0; i < tableSize; i++)
> {
> byte[] guid = makeGUID(26);
> guids[i] = guid;
> dictionary[guid] = i;
> }
>
> dictionary.rehash;
>
> ulong start = time(null);
> for (int i = 0; i < (500 * 1000 * 1000); i++)
> {
> byte[] guid = guids[i % guids.length];
> int j = dictionary[guid];
> }
> ulong stop = time(null);
>
> writefln(stop-start);
>
> return 0;
> }
> }
A toHash for a GUID could basically just return the first 4 bytes of the
GUID as hash:
struct Guid
{
byte[] guid;// or fixed size?
uint toHash() { return *cast(uint*)byte.ptr; }
}
int[Guid] dictionary;
I've boosted AArray filling/reading in the past by temporarily disabling
the garbage collector (std.gc.disable()/enable()).
Anyway, if performance doesn't turn you to D, surely this will:
# for (int i = 0; i < 500_000_000; i++)
L.
More information about the Digitalmars-d
mailing list