GC.calloc with random bits causes slowdown, also seen in built in AA

Michael Rynn michaelrynn at optusnet.com.au
Wed Mar 10 17:33:32 PST 2010


Looked at the Associative Arrays project on http://www.dsource.org/
projects/aa, with the idea of testing and maybe uploading a trie class 
(radix tree)..

Ran the test program for PyDict noted the average result for consecutive 
runs of the one test it did, for  uint[unit]  associative arrays.

 This was very irritating, because of long wait for command line output. 
The test.d default sizes and number of runs were very big and only put 
out the average result at the end.

Changed test.d to output each individual run time.
The time for each individual run got progressively longer.

This was fixable by making the calloc block attribute be 
GC.BlkAttr.NO_SCAN.  So maybe the randomized uint bits confuse the 
garbage collector as it scans through.

In the PyDictD1 test code there was a call to disable the garbage 
collector. Thats cheating.

I looked at the code for PyDictD2.d  and I decided that the table of 
struct Entry,  holding the key and value, for which calloc is used , 
could be replaced by a more D - like (its a template anyway) call to new 
Entry[size].  The size is always a power of 2. 

Is it better to use new Entry[] rather than calloc,  since the GC has a 
map of which values are pointers and which are not?  Hopefully just like 
real D code.

Ran the test again after removing the calloc, and the speed improved and 
the progressive run time increase went away.

For comparison, I also put in a wrapper class template for the built in D 
AA type, and did the same tests. For uint[unit] this also had a 
progressive slow down in run-time.  Same issue found for the D1 builtin 
AA.

So the built-in associative array type of D suffers from this disease.
(under very stressful array size parameters of course). It might affect 
long running programs with big builtin AA tables. 


The mystery is  why was the slowdown progressive.  After each run the AA 
variable is deleted, and the GC gets told to delete the calloc array 
directly. If the previous block of memory is properly gone, and a new 
cleared block created, shouldn't the run results be the same each time? 

So now the PyDict does not use calloc. and is faster than the built-in. I 
added a few missing properties (length, opIndex) 

The tests for string[uint] were re-instated.  The built in AA had some 
progressive run slowdown tendency for this as well, although not as 
marked.

The trie class (which has been submitted) had made was miserably slow in 
comparison to the PyDict for inserts.  Lookups were much faster, but 
still not in the same ballpark as AA. 
 
My prior excitement about the lookup speed of the radix tree was actually 
due to an embarrassment of having code in an assert statement removed on 
the release compile.

The trie class however, at least can do prefix key completion well.

PyDictD1 is updated with the same change. 
The code is updated in SVN, so anyone can check out the changes. (and 
even update with any Dsource login).

--tof--

Michael Rynn









More information about the Digitalmars-d mailing list