Associative Arrays need cleanout method or property to help
Walter Bright
newshound1 at digitalmars.com
Tue Mar 23 13:41:56 PDT 2010
Michael Rynn wrote:
> On Mon, 22 Mar 2010 00:52:24 -0700, Walter Bright wrote:
>
>> bearophile wrote:
>>> A way to know the usage patterns of D AAs is to add instrumentation in
>>> nonrelease mode, they can save few statistical data once in a while in
>>> the install directory of D, and then the user can give this little txt
>>> file to Walter to tune the language :-)
>> There's no need to tune the language. The implementation and data
>> structure of the AAs is completely opaque to the language. The
>> implementation is in aaA.d. Feel free to try different implementations
>> in it!
>
> Well, I have.
>
> See the code here : http://www.dsource.org/projects/aa/browser/trunk/
> druntime/aaA.d
>
> See the benchmarks and comments here : http://www.dsource.org/projects/aa/
> wiki/DrunTimeAA.
>
> The result squeezes some extra performance for integer or less sized keys
> (about 20% faster for lookups).
>
> Insertions and cleanups are faster too.
It's a nice implementation. There are some problems with it, though:
1. Deleted entries cannot be returned to the garbage collector pool unless the
entire hash table is removed.
2. All the keys are stored in a single array, as are all the values. This
requires that a contiguous chunk of memory can be found for the arrays when
rehashing to a larger hashmap. This could make it very difficult for this to
work when hash tables grow to be of significant size relative to all available
memory. Note that the current implementation has a bucket array that has as many
entries as the number of elements in the hashtable, but the element size is only
a pointer, not the arbitrary size of a key or value.
3. Rehashing (required as the hashmap grows) means that the key/value arrays
must be reallocated and copied. That's fine for the keys, but for the values
this is a huge problem because the user may have dangling pointers into the
values. (This is why Andrei started the thread entitled "An important potential
change to the language: transitory ref".) I think this is currently the killer
problem for randAA's approach.
More information about the Digitalmars-d
mailing list