storing the hash multiplier instead of the hash value

Tue Mar 23 08:10:04 PDT 2010

On 03/23/2010 08:11 AM, Michael Rynn wrote:
> On Mon, 22 Mar 2010 16:59:36 -0500, Andrei Alexandrescu wrote:
>
>> On 03/22/2010 04:03 PM, Andrei Alexandrescu wrote:
>>> On 03/22/2010 03:36 PM, Walter Bright wrote:
>>>> Andrei Alexandrescu wrote:
>>>>> Better suggestions are always welcome. For integrals I'm unclear on
>>>>> what we could use to make things better. (Clearly we could and should
>>>>> get rid of the extraneous field.)
>>>>
>>>> Unfortunately, it won't be much of a win. Memory allocation is done in
>>>> buckets of size 16, 32, 64, etc. Reducing the node size for a
>>>> uint[uint] from 16 to 12 saves no memory at all.
>>>
>>> As we discussed, if nodes are allocated in bunches, you could store 5
>>> nodes in a 64-byte block instead of 4. That's more than a 25% increase
>>> in effectiveness because the per-block bookkeeping is also slashed by
>>> 5.
>>>
>>> Andrei
>>
>> One more idea before I forget: the bunches approach requires using a
>> freelist for nodes that are available but not used. (Freelists are a
>> good idea whether or not bunches are used.)
>>
>> One good possibility is to store the head of the freelist in a static
>> variable, such that e.g. all int[string] instances use the same
>> freelist. And there's no thread contention issue because each thread has
>> its own freelist.
>>
>>
>> Andrei
>
> I just committed a aaA.d version that uses some heap node memory
> management, although its on a per class instance basis. Also it dispences
> with a separate hash storage for keys<= size_t.  Sharing would mean some
> working out of which classes share the same sized node blocks.
> Much easier to implement class sharing using templates.
>
> See the code here : http://www.dsource.org/projects/aa/browser/trunk/
> druntime/aaA.d
>
> See the benchmarks and comments here : http://www.dsource.org/projects/aa/
> wiki/DrunTimeAA.
>
> The result squeezes some extra performance for integer or less sized keys
> (about 20% faster for lookups).
>
>
> For the druntime, the generic C interface constrains the kinds of
> specialized AA versions that can be instantiated using run-time TypeInfo
> and var-arged calls. Maybe a class / interface direct calls would work
> better. Just imagine at runtime looking at the TypeInfo_AssociatedArray
> and trying to work out exactly which template is going to be instantiated.
>
> And having a low code size, one or few versions fit  all approach, that
> performance sacrifice is unavoidable in a basic runtime libary.
>
> But in template land , options and combinations and gains abound.

Absolutely!

This is solid work. It would be interesting to assess how your 
implementation and the old built-in hashes compare with Walter's new 
implementation, which uses a singly-linked list instead of a tree. Do 
you have what it takes to check out and build druntime and phobos?

Andrei