Associative Array: reasonable limits?

Dmitry Olshansky dmitry.olsh at gmail.com
Sun Nov 3 01:46:09 PDT 2013


03-Nov-2013 02:37, Charles Hixson wrote:
> I'm contemplating an associative array that will eventually grow to be
> an estimated 64KB in size, assuming it's about half full.  It would then
> be holding around 90,000 entries.  Is this reasonable, or should I go
> with a sorted array, and do binary searches?  I estimate that binary
> searches would average around 15 accesses/hit, but that's still a lot
> faster than disk access.  The array would be nearly full, so there would
> be less wasted space, but, of course, insertions would become quite
> expensive.  (Deletions aren't expected.)  In either case the keys would
> be fixed length character arrays, and the values would also be simple
> structs without reference types (so the garbage collector could pretty
> much ignore things).

90k elements is a small AA. It would work just fine with any sane hash 
function (e.g. the built-in one).
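
For instance, here is a minimal sketch of what such a table could look
like (the key length char[8] and the Entry struct are made up for
illustration, not taken from your post):

import std.stdio;

// Simple value type with no reference fields, so the GC can
// largely ignore the table's contents.
struct Entry
{
    int count;
    double weight;
}

void main()
{
    // Fixed-length character arrays work as AA keys out of the box.
    Entry[char[8]] table;

    char[8] key = "abcdefgh";
    table[key] = Entry(1, 0.5);

    // `in` yields a pointer to the value, or null if absent.
    if (auto p = key in table)
        writeln("found: ", p.count);
}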

At around 10M+ elements it may be worth considering space-saving 
techniques. That said, on x64 even 100M+ is perfectly OK; on 32-bit I'd 
advise against it, but only because of the GC.

> FWIW I'm expecting this array to be around the entire time the program
> is running, but it would probably be rolled out most of the time, as
> frequently accessed items would be pulled into a much smaller structure,
> and if I go with the array rather than the hash table I may only do
> updates as a batch.  (Since it's linear with the size of the array, but
> essentially ignores the number of items being added.)
>
> My temptation is to go with the hash table...but I'm not sure of the
> underlying mechanics.

Think of it as a sparse array that has "holes" and takes around 2-4 
times the size of a normal array. That's not only because of the holes 
but also because each slot is a bit bigger.
In return, lookups/inserts/deletions are cheap: O(1) with high probability.
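
A toy linear-probing sketch of that idea (this is not the actual
druntime implementation, just the concept of a sparse slot array):

// Each slot carries a `used` flag on top of the key/value pair;
// that's the extra per-slot bookkeeping that makes slots bigger.
struct Slot(K, V)
{
    bool used;
    K key;
    V value;
}

// Probe forward from the hashed position until we find the key or
// hit an empty hole. Assumes the table is never completely full.
size_t find(K, V)(Slot!(K, V)[] slots, K key)
{
    auto i = typeid(K).getHash(&key) % slots.length;
    while (slots[i].used && slots[i].key != key)
        i = (i + 1) % slots.length;
    return i;
}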

-- 
Dmitry Olshansky

