Hash table element existence check

Illuminati via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Sep 3 05:40:08 PDT 2016


On Saturday, 3 September 2016 at 09:43:04 UTC, Basile B. wrote:
> On Friday, 2 September 2016 at 19:38:34 UTC, Illuminati wrote:
>> I am trying to create a hash table and would like an efficient 
>> way to be able to know if an element exists to test for 
>> collisions.
>>
>> I could keep a bitarray, but wasting around 12% space. I could 
>> use pointers(null check) to elements but this creates 
>> fragmentation. It is not terrible, just curious if anyone has 
>> a better way?
>
> fragmentation is a consequence of the hash function. You should 
> set the hasher as a template parameter so that, according to 
> the value type, the best hash fun (the one that creates less 
> clustering) can be supplied.
>

I mean memory fragmentation. If the key and value are structs, 
The memory is an array of tuple(key, value). Any scanning and 
returning are quite efficiency since all the tuples are next to 
each other. If they are pointers to the key/value then they could 
point to any location in memory. Scanning and such are far more 
likely to create cache misses. Maybe not a big deal for simple 
one-time access but in other cases it could be extremely 
slow(such as iterating over the table).

> But otherwise the buckets is almost always an array of 
> ReturnType!hashFun with the max value wrapped around the next 
> power of two value following entry count.

My hash table is simply a fixed array of type X = tuple(key, 
value). X is at location key.hashOf % length(more or less). When 
the table becomes too small, it is enlarged and everything is 
rehashed. But keys and values can be values or references and 
this changes the behavior. references can be checked for null, 
but values can't.








More information about the Digitalmars-d-learn mailing list