storing the hash multiplier instead of the hash value

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Mon Mar 22 13:04:56 PDT 2010


On 03/22/2010 02:50 PM, Daniel Keep wrote:
>
> How about just *fixing* the hashtable so it doesn't generate false
> pointers in the first place?  Maybe I'm just strange, but that seems
> like a more direct, efficacious solution...

This was of course the first thing Walter and I discussed. It turns out 
that that would necessitate precise GC, which we don't have yet.

> As for the redundancy, I was under the impression that the hash was
> cached so make resizing more efficient: if the number of buckets
> changes, you have to re-insert all the nodes.  If you don't store the
> hash, you have to recompute it (which is expensive for anything other
> than ints (wherein D laughably 'hashes' by returning the ints value)).

What would be a better hashing scheme for ints?

> Given that I already think this is a silly way to go about solving the
> false pointer issue, I don't see any benefit to doing this other than to
> give the CPU something to do while it waits for memory reads.
>
> Sadly, I lack the background to make any sort of objective judgement of
> the hash function *itself*, so I'll just reiterate what I've heard
> repeated to me over and over again by numerous people: D's builtin hash
> function sucks like a black holes.

Then you might be glad to hear that Walter has recently improved the 
default hashtable implementation significantly (at least for heavy-duty 
applications).

We're using Paul Hsieh's hash function for strings and general memory, 
which is no slouch.

http://www.dsource.org/projects/druntime/browser/trunk/src/rt/util/hash.d

Better suggestions are always welcome. For integrals I'm unclear on what 
we could use to make things better. (Clearly we could and should get rid 
of the extraneous field.)


Andrei



More information about the Digitalmars-d mailing list