Memory usage of AAs?

Nick Sabalausky a at a.a
Tue Mar 29 19:20:05 PDT 2011


"spir" <denis.spir at gmail.com> wrote in message 
news:mailman.2909.1301443345.4748.digitalmars-d-learn at puremagic.com...
> On 03/30/2011 01:24 AM, Nick Sabalausky wrote:
>> My understanding of hash tables is that they allocate a fixed size array 
>> and
>> map keys to indicies within the range 0..predefined_length_of_the_AA.
>>
>> So I've been wondering, how many elements do D's built-in AAs have? And
>> what's the content of each one, just a single pointer?
>
> Each element is a data structure, often called bucket (typically a link 
> list), storing (key:value) pairs for which the key, once hashed and 
> modulo-ed, maps to the given index. That's why the famous O(1) lookup time 
> for hash tables is very theoretic: the constant part holds average time 
> for hashing which is very variable, plus another average time for linearly 
> traversing the bucket. The latter part depends on table load factor (= 
> number of elements / number of buckets) and proper statistical repartition 
> of keys into buckets.
>
> The key (!) points are finding a good hash func to "linearize" said 
> repartition (which depends on actual key-data domain, not only on their 
> type...), but better ones rapidly eat much time (ones used in practice are 
> rather simple & fast); and finding optimal load factor, and growth scheme. 
> In practice, all of this tends to make hash tables an implementation 
> nightmare (for me). I'd love to find practicle alternatives, but efficient 
> binary trees also are complex and even more depend on kind of keys, I 
> guess.
>

Right, I know that, but that's not what I was asking. Take this hypothetical 
implementation:

struct Bucket(TKey, TVal)
{
    ...
}

enum hashTableSize = ...;

struct Hash(TKey, TVal)
{
    Bucket!(TKey, TVal)[hashTableSize] data;

    TVal get(TKey key) { ... }
    void set(TKey key, TVal value) { ... }
}

I assume that D's AA are something at least vaguely like that. My questions 
are:

1. What does D use for "hashTableSize"? Or does "hashTableSize" vary? If it 
varies, what's a typical rough ballpark size? (And just out of curiosity, if 
it varies, is it decided at compile-time, or does it change even at 
runtime?)

2. What is "sizeof(Bucket(TKey, TVal))"? And I mean the shallow size, not 
deep size. Is it dependent on TKey or TVal? Or is it just simply a pointer 
to the start of a linked list (and therefore "sizeof(size_t)")?





More information about the Digitalmars-d-learn mailing list