Absolutely horrible default string hashing

Kristian Kilpi kjkilpi at gmail.com
Sun May 3 05:19:51 PDT 2009


On Sun, 03 May 2009 04:59:26 +0300, BCS <none at anon.com> wrote:
>
> IIRC something like this is common:
>
> hash_t ret = 0;
> foreach(c;str) { ret *= SomePrime; ret += c; }
>

I think that's the basis for the best general string hashing funcs around.  
IIRC from the university, it doesn't matter, in practice, if the  
multiplier is a prime number or not. So, the multiplication can be  
replaced with bit shifting (that's as fast as the addition operation).
E.g.

foreach(c; str)
{
     ret = (ret << 4) + c;
}

I quickly googled around and found the algorithm djb2 which uses the  
multiplier 33:
http://www.cse.yorku.ca/~oz/hash.html

djb2 is obviously a little bit slower but it *may* give slightly better  
distribution (I'm not gonna run tests now to see if there is a real world  
difference... ;) ).



More information about the Digitalmars-d mailing list