AA rehash threshold

Steven Schveighoffer via Digitalmars-d digitalmars-d at puremagic.com
Thu Nov 20 18:18:54 PST 2014


On 11/20/14 8:42 PM, Jerry Quinn wrote:
> Steven Schveighoffer <schveiguy at yahoo.com> writes:
>
>> On 11/20/14 5:30 PM, Jerry Quinn wrote:
>>> This works nicely for small types, but has gotchas.  For example, if
>>> you've got an AA of ints, what value indicates that this is a value
>>> folded into the bucket entry vs actually being a pointer?  You'll need
>>> extra space to make sure it's safe.  Alignment is another concern.
>>> Let's say you have bool for the entry/element flag.  With ints you've
>>> doubled the size of the bucket array.
>>
>> Hm.. the bucket entry will simply be what a node is now. You are actually
>> saving space, because you don't need to store that extra pointer to the first
>> node.
>
> Almost. You do need a way to indicate that the bucket is empty.  So you
> need another flag though it can be smaller than a pointer (for 64 bit).  But
> the overhead is low except for things like ints.

It's easy enough to set the "next" pointer to some invalid-but-not-null 
value. Hackish, but it would work ;)

>> Where it gets dicey is when you rehash. Some of the nodes will be moved into
>> the bucket space, some will move out of it. But I don't think it's a big deal,
>> as a rehash doesn't happen very often.
>
> True.  Umm, that does raise a question - are addresses of AA contents
> required to be stable?  C++11 requires that of its hashtables.  But it
> ties your hands when trying to write customized hash tables.

I don't think we make any guarantees about that. As it stands now, 
whatever is implemented seems to be what people think is the spec. But 
it's not always true.

We have so many more options when AA's become a library type. We could 
say "the builtin AA is this flavor of hashtable", but provide options 
for many different flavors with the same template if you want specific ones.

-Steve


More information about the Digitalmars-d mailing list