storing the hash multiplier instead of the hash value

Mon Mar 22 13:04:56 PDT 2010

On 03/22/2010 02:50 PM, Daniel Keep wrote:
>
> How about just *fixing* the hashtable so it doesn't generate false
> pointers in the first place?  Maybe I'm just strange, but that seems
> like a more direct, efficacious solution...

This was of course the first thing Walter and I discussed. It turns out 
that that would necessitate precise GC, which we don't have yet.

> As for the redundancy, I was under the impression that the hash was
> cached so make resizing more efficient: if the number of buckets
> changes, you have to re-insert all the nodes.  If you don't store the
> hash, you have to recompute it (which is expensive for anything other
> than ints (wherein D laughably 'hashes' by returning the ints value)).

What would be a better hashing scheme for ints?

> Given that I already think this is a silly way to go about solving the
> false pointer issue, I don't see any benefit to doing this other than to
> give the CPU something to do while it waits for memory reads.
>
> Sadly, I lack the background to make any sort of objective judgement of
> the hash function *itself*, so I'll just reiterate what I've heard
> repeated to me over and over again by numerous people: D's builtin hash
> function sucks like a black holes.

Then you might be glad to hear that Walter has recently improved the 
default hashtable implementation significantly (at least for heavy-duty 
applications).

We're using Paul Hsieh's hash function for strings and general memory, 
which is no slouch.

http://www.dsource.org/projects/druntime/browser/trunk/src/rt/util/hash.d

Better suggestions are always welcome. For integrals I'm unclear on what 
we could use to make things better. (Clearly we could and should get rid 
of the extraneous field.)

Andrei