Absolutely horrible default string hashing
Kristian Kilpi
kjkilpi at gmail.com
Sun May 3 05:19:51 PDT 2009
On Sun, 03 May 2009 04:59:26 +0300, BCS <none at anon.com> wrote:
>
> IIRC something like this is common:
>
> hash_t ret = 0;
> foreach(c;str) { ret *= SomePrime; ret += c; }
>
I think that's the basis for the best general-purpose string hashing functions around.
If I recall correctly from university, it doesn't matter in practice whether
the multiplier is a prime number or not. So the multiplication can be
replaced with a bit shift (which is as fast as an addition).
E.g.:
    hash_t ret = 0;
    foreach (c; str)
    {
        ret = (ret << 4) + c;
    }
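
To make the shift version concrete, here is a minimal sketch in D. The name shiftHash is made up for illustration, and uint is my assumption to pin the hash at 32 bits (hash_t may be wider on your platform). One thing the sketch makes visible: with a plain shift by 4, a character's contribution is shifted out of a 32-bit value once eight more characters follow it, so only the tail of a long string ends up in the result.

```d
// Sketch only: shift-based string hash (effectively multiplier 16).
// shiftHash is a hypothetical name; uint is assumed to fix the width at 32 bits.
uint shiftHash(const(char)[] str)
{
    uint ret = 0;
    foreach (c; str)
    {
        // Same as ret * 16 + c, wrapping modulo 2^32.
        ret = (ret << 4) + c;
    }
    return ret;
}

unittest
{
    // 'a' is 97 and 'b' is 98, so the hash is 97 * 16 + 98.
    assert(shiftHash("ab") == 97 * 16 + 98);

    // The first character is followed by eight more, so it has been
    // shifted out entirely and these two strings collide.
    assert(shiftHash("Xabcdefgh") == shiftHash("Yabcdefgh"));
}
```

Compiling with -unittest runs the checks. With a 64-bit hash the same thing happens, just after sixteen characters instead of eight.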
I quickly googled around and found the djb2 algorithm, which uses the
multiplier 33:
http://www.cse.yorku.ca/~oz/hash.html
djb2 computes hash * 33 + c as ((hash << 5) + hash) + c, so it needs one
extra addition per character and is a little bit slower, but it *may* give
a slightly better distribution (I'm not going to run tests now to see if
there is a real-world difference... ;) ).
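
For reference, a rough D port of djb2 as given on the page linked above. The start value 5381 and the shift-and-add form come from that page. The D signature, the name djb2 as a function, and the choice of uint (the C original uses unsigned long) are my assumptions for the sketch.

```d
// Sketch only: djb2 ported to D; uint is an assumed 32-bit hash width.
uint djb2(const(char)[] str)
{
    uint hash = 5381;
    foreach (c; str)
    {
        // (hash << 5) + hash is hash * 33; then add the character.
        hash = ((hash << 5) + hash) + c;
    }
    return hash;
}

unittest
{
    // The empty string leaves the start value untouched.
    assert(djb2("") == 5381);

    // One character: 5381 * 33 + 'a' (97).
    assert(djb2("a") == 5381 * 33 + 97);

    // 33 is odd, so an early character still influences the result
    // no matter how many characters follow it.
    assert(djb2("Xabcdefgh") != djb2("Yabcdefgh"));
}
```

Whether that translates into a measurably better distribution on real keys is exactly the part I have not tested.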