Hash table element existence check
Illuminati via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sun Sep 4 16:25:54 PDT 2016
On Saturday, 3 September 2016 at 14:07:27 UTC, Cauterite wrote:
> On Saturday, 3 September 2016 at 12:33:26 UTC, Illuminati wrote:
>> On Saturday, 3 September 2016 at 07:44:28 UTC, Cauterite wrote:
>>> On Friday, 2 September 2016 at 19:38:34 UTC, Illuminati wrote:
>>>> I am trying to create a hash table and would like an
>>>> efficient way to be able to know if an element exists to
>>>> test for collisions.
>>>
>>> Just do a regular lookup on the hash? It's an O(1) operation,
>>> like 4 instructions.
>>
>> Huh? One can look up fine, but how does one know if the result
>> is valid or not?
>
> Okay I think I misinterpreted the question. I believe when I
> did this my solution was to have an additional template
> parameter specifying null key values, so the template was like
> this:
>
> struct HashTbl(
> K, /* key type */
> V, /* value type */
> K NullKey, /* key placeholder value */
> alias hash_key, /* hash function */
> alias keys_eq /* equality function */
> ) {...
>
> I guess this doesn't really solve the problem, but makes it the
> user's problem.
Yeah, and there might not be any "null" keys. If there is, then
it is easy, But, say the key is an int... what is "null"? While
it does let the user define what null means, it assumes that it
is possible to do so and that might not always be the case.
I suppose one could then switch to pointers and waste a bit of
space if necessary. The main point of the question was if there
were any tricks that avoided this problem. E.g., maybe the hash
function itself could be used to determine if the key exists. Not
sure how but who knows?
>> I could keep a bitarray, but wasting around 12% space.
>
> That assumes your (K,V) tuples are 1 byte total, right?
Yeah. Larger values will reduce the size to some degree. Really,
it is simply N/8 wasted space. For large N it becomes a problem.
Maybe not a huge deal since virtually most of the time N < 1M,
and it would be around 125k bytes for such a large array. Large
arrays could probably be handled differently if memory
constraints were such an issue.
More information about the Digitalmars-d-learn
mailing list