Memory usage of AAs?
Nick Sabalausky
a at a.a
Tue Mar 29 19:20:05 PDT 2011
"spir" <denis.spir at gmail.com> wrote in message
news:mailman.2909.1301443345.4748.digitalmars-d-learn at puremagic.com...
> On 03/30/2011 01:24 AM, Nick Sabalausky wrote:
>> My understanding of hash tables is that they allocate a fixed size array
>> and
>> map keys to indicies within the range 0..predefined_length_of_the_AA.
>>
>> So I've been wondering, how many elements do D's built-in AAs have? And
>> what's the content of each one, just a single pointer?
>
> Each element is a data structure, often called bucket (typically a link
> list), storing (key:value) pairs for which the key, once hashed and
> modulo-ed, maps to the given index. That's why the famous O(1) lookup time
> for hash tables is very theoretic: the constant part holds average time
> for hashing which is very variable, plus another average time for linearly
> traversing the bucket. The latter part depends on table load factor (=
> number of elements / number of buckets) and proper statistical repartition
> of keys into buckets.
>
> The key (!) points are finding a good hash func to "linearize" said
> repartition (which depends on actual key-data domain, not only on their
> type...), but better ones rapidly eat much time (ones used in practice are
> rather simple & fast); and finding optimal load factor, and growth scheme.
> In practice, all of this tends to make hash tables an implementation
> nightmare (for me). I'd love to find practicle alternatives, but efficient
> binary trees also are complex and even more depend on kind of keys, I
> guess.
>
Right, I know that, but that's not what I was asking. Take this hypothetical
implementation:
struct Bucket(TKey, TVal)
{
...
}
enum hashTableSize = ...;
struct Hash(TKey, TVal)
{
Bucket!(TKey, TVal)[hashTableSize] data;
TVal get(TKey key) { ... }
void set(TKey key, TVal value) { ... }
}
I assume that D's AA are something at least vaguely like that. My questions
are:
1. What does D use for "hashTableSize"? Or does "hashTableSize" vary? If it
varies, what's a typical rough ballpark size? (And just out of curiosity, if
it varies, is it decided at compile-time, or does it change even at
runtime?)
2. What is "sizeof(Bucket(TKey, TVal))"? And I mean the shallow size, not
deep size. Is it dependent on TKey or TVal? Or is it just simply a pointer
to the start of a linked list (and therefore "sizeof(size_t)")?
More information about the Digitalmars-d-learn
mailing list