Expected performance

Lionello Lunesu lio at lunesu.remove.com
Thu Nov 8 01:15:13 PST 2007


Martin Hess wrote:
> I'm considering D for a project that is very CPU intensive, so I'm evaluating D's performance. As a first test I started with associative arrays and ended up with a result that surprised me a bit, so I would like to get some input on the test's construction.
> 
> The test builds a dictionary of 50k byte arrays that map to integers.
> 
> int[byte[]] dictionary;
> 
> Next it looks up elements a half billion times.
> 
> Assuming 1.5 instructions per cycle I'm seeing about 387 instructions per lookup.
> 
> I tried it in Java and was seeing about 309 instructions per lookup.
> 
> Naively I assumed I would get better performance than Java. Does anyone have any thoughts on this?
> 
> Test details:
> 
> Mac OSX 10.5
> GDC 4.0.1
> Java 1.5.0_13
> 
> ================
> 
> import std.stdio;
> import std.random;
> 
> char[] letters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
> 
> byte[] makeGUID(int size)
> {
> 	byte[] a = new byte[size];
> 	
> 	for (int i = 0; i < size; i++)
> 	{
> 		a[i] = letters[rand() % letters.length];
> 	}
> 	
> 	return a;
> }
> 
> const int tableSize = 50000;
> 
> version(Unix)
> {
> 	private import std.c.unix.unix;
> 	
> int main (char[][] args)
> {
> 	byte[][tableSize] guids;
> 	int[byte[]] dictionary;
> 		
> 	for (int i = 0; i < tableSize; i++)
> 	{
> 		byte[] guid = makeGUID(26);
> 		guids[i] = guid;
> 		dictionary[guid] = i;
> 	}
> 	
> 	dictionary.rehash;
> 	
> 	ulong start = time(null);
> 	for (int i = 0; i < (500 * 1000 * 1000); i++)
> 	{
> 		byte[] guid = guids[i % guids.length];
> 		int j = dictionary[guid];
> 	}
> 	ulong stop = time(null);
> 	
> 	writefln(stop-start);
> 	
> 	return 0;
> }
> }

A toHash for a GUID could basically just return the first 4 bytes of the 
GUID as the hash:

struct Guid
{
   byte[] guid; // or fixed size?
   uint toHash() { return *cast(uint*)guid.ptr; }
}
int[Guid] dictionary;
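
Note that the hash only picks the bucket; if I remember the rules for 
structs as AA keys correctly, the struct should also compare the full 
byte sequence so colliding keys can be told apart. Something along 
these lines (the exact member signatures are from memory, so treat 
them as an assumption and adjust if the compiler complains):

struct Guid
{
    byte[] guid; // or fixed size?

    // First 4 bytes as the hash; assumes guid.length >= 4.
    uint toHash() { return *cast(uint*)guid.ptr; }

    // Content comparison, so two distinct arrays holding the same
    // bytes count as the same key.
    int opEquals(Guid* other) { return guid == other.guid; }

    // Lexicographic comparison of the raw bytes, used on collisions.
    int opCmp(Guid* other)
    {
        if (guid.length != other.guid.length)
            return guid.length < other.guid.length ? -1 : 1;
        for (int i = 0; i < guid.length; i++)
        {
            if (guid[i] != other.guid[i])
                return guid[i] < other.guid[i] ? -1 : 1;
        }
        return 0;
    }
}
int[Guid] dictionary;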

I've sped up associative array filling/reading in the past by temporarily 
disabling the garbage collector (std.gc.disable()/enable()).
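
For instance, around the fill loop from your test (just a sketch, 
reusing your makeGUID and tableSize, and assuming a D1 Phobos where 
std.gc provides disable() and enable()):

import std.gc;

int[byte[]] dictionary;

// Disable collections while the table is built, so the many small
// allocations made by the AA don't keep triggering the collector.
std.gc.disable();

for (int i = 0; i < tableSize; i++)
{
    byte[] guid = makeGUID(26);
    dictionary[guid] = i;
}

// Turn the collector back on once the bulk insertion is done.
std.gc.enable();

dictionary.rehash;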

Anyway, if performance doesn't turn you to D, surely this will:

#  for (int i = 0; i < 500_000_000; i++)

L.


