storing the hash multiplier instead of the hash value
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Mon Mar 22 10:51:37 PDT 2010
Currently, each entry in a D hashtable stores the hash of the key for
efficiency purposes.
There is a bit of redundancy: as soon as you entered a hash bucket you
know that the hashkey % tablesize is a specific number, call it k. But k
is not enough information to restore the hash, so the actual hash gets
stored for each entry.
I'm thinking of exploiting that redundancy by storing the hash
multiplier, not the hash value. Instead of storing h in each slot, store
m = h / tablesize. Then you can restore the actual h by writing:
restored_h = m * tablesize + k;
k is known for each given bucket (it's the index of the bucket) and m is
what gets stored per entry.
What's the advantage of this approach? Well the advantage is that m is a
small number. Any hash function will try to disperse the hash value as
much as possible between the 32 available bits. But m will be a smaller
number, and therefore will be more grouped and will engender fewer false
pointers.
Would this help?
Andrei
More information about the Digitalmars-d
mailing list