hashed array?

H. S. Teoh hsteoh at quickfur.ath.cx
Mon Nov 12 11:07:48 PST 2012


On Mon, Nov 12, 2012 at 09:58:14PM +0400, Dmitry Olshansky wrote:
> 11/12/2012 9:44 PM, H. S. Teoh пишет:
[...]
> >For storing integer types, you could optimize for space by dividing
> >your key space into blocks of, say, 32 entries each. Then your hash
> >function will hash only the upper (n-5) bits of the key, and each
> >hash slot will be a bit array of 32 bits, which are indexed by the
> >lower 5 bits of the key. Assuming your keys are relatively densely
> >populated, this will save quite a significant amount of space. In the
> >sparse case, you probably don't waste that much space either, since
> >each slot is only one 32-bit int wide.
> 
> Problem is that keys will collide so each slot will have to have upper
> bits of key *and* the small set. The bucket itself is still inside
> intrusive linked list.

True.


> Otherwise the idea is great. About 3 words of overhead per word-sized
> set.
> 
> >
> >You can probably also increase the size of the slots to, say, 64 bits
> >or more, if you want to maximize the usage to GC allocation block
> >overhead ratio. Yes, each slot is only 1 int wide, but the GC does
> >have to maintain bookkeeping information that's probably a lot more
> >than that.
> 
> 16 bytes. Though you'd have to know how bucket entry looks like
> which is impossible with built-in hash map sadly.
[...]

Actually, the built-in AA's bucket struct is duplicated in object_.d;
it's what the range-based API uses to walk the AA. (Yeah it's extremely
ugly. It must be cleaned up.)


T

-- 
There is no gravity. The earth sucks.


More information about the Digitalmars-d mailing list