Absolutely horrible default string hashing

Steve Teale steve.teale at britseyeview.com
Sun May 3 11:20:38 PDT 2009


Kristian Kilpi Wrote:

> On Sun, 03 May 2009 15:33:12 +0300, Jérôme M. Berger <jeberger at free.fr>  
> wrote:
> > |
> > | foreach(c; str)
> > | {
> > |     ret = (ret << 4) + c;
> > | }
> > |
> > 	That one is very bad because it only takes into account the last
> > few characters of the string (how many depends on the size of the
> > hash). However, several hashing algorithms use 2^n+1 as their
> > multiplier, which is very fast even on old/small/embedded hardware
> > because it can be implemented as a shift and an addition.
> 
> You are of course right; thanks for the correction. Lets scrap that  
> algorithm! :)

Would something like

foreach(i, c; str)
{
   ret ^= cast(uint) c;
   if (i & 1)
      ret <<= 1;
}

do a better job. That could cover the first 64 chars reasonably well, and I suspect that could cover a lot of use cases.




More information about the Digitalmars-d mailing list