Performance of hashes and associative arrays

Manfred Nowak svv1999 at hotmail.com
Sun Nov 11 03:38:28 PST 2012


Jakse wrote:

> It would also be interesting to have ideas for the general
> case 

Yes, ideas might be interesting. :-)

A root of "good" hashing is incorporated in algorithm2 of your 
posting:

spread the values produced by the hash function uniformly over 
the interval  size_t.min .. size_t.max.

Therefore:

Compute the hash value for a given key by multiplying each byte 
of the key by a factor that increases from the byte with the 
lowest significance to the byte with the highest significance; 
of course, add the products.

Remarks:
a)
Because this is the general case: assume that the original key 
contains much redundancy. Therefore do not hash the bytes of the 
original key directly, but compute an (at least close to) 
lossless compression of the original key and hash that.
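Remark a) works with any lossless coder; as a toy illustration (the post names no specific compressor — run-length encoding is my own choice, and only pays off on redundant keys), the key could be run-length encoded before the bytes are fed to the hash:

```c
#include <stddef.h>

/* Toy lossless compression: run-length encode `in` into `out`
 * (out must have room for 2*len bytes) and return the encoded
 * length.  Each run becomes a (count, byte) pair; note that for
 * keys with no repeats the "compressed" form is twice as long,
 * so a real implementation would pick a better coder. */
size_t rle(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < len; ) {
        size_t run = 1;
        while (i + run < len && in[i + run] == in[i] && run < 255)
            run++;
        out[o++] = (unsigned char)run;  /* run length */
        out[o++] = in[i];               /* the repeated byte */
        i += run;
    }
    return o;  /* length of the compressed key */
}
```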

b)
The factors in the factor list should be primes, so that the 
hash value is spread uniformly over the intended range.

c)
The quotient of two consecutive primes in the factor list should 
be greater than 256, because the key in use might not contain 
any redundancy.
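Remarks b) and c) together pin down the factor list: primes in which each entry is more than 256 times its predecessor. A sketch of how such a list could be generated (trial-division primality is my own simplistic choice, fine for a handful of factors):

```c
#include <stddef.h>

/* Simple trial-division primality test; adequate for generating
 * a short factor list. */
int is_prime(size_t n)
{
    if (n < 2) return 0;
    for (size_t d = 2; d * d <= n; d++)
        if (n % d == 0) return 0;
    return 1;
}

/* Fill out[0..count-1] with primes whose consecutive quotient
 * exceeds 256, per remarks b) and c).  Starting from prev = 1
 * makes the first factor 257, the smallest prime above 256. */
void build_factors(size_t *out, size_t count)
{
    size_t prev = 1;
    for (size_t i = 0; i < count; i++) {
        size_t c = prev * 256 + 1;   /* force out[i]/prev > 256 */
        while (!is_prime(c))
            c++;
        out[i] = c;
        prev = c;
    }
}
```

With factors chosen this way, every byte of a key — even a fully incompressible one — lands in its own "stripe" of the hash range before the sums wrap around.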

-manfred


More information about the Digitalmars-d-learn mailing list