Cost of assoc array?
John Colvin via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Wed May 14 06:44:39 PDT 2014
On Wednesday, 14 May 2014 at 13:20:40 UTC, Chris wrote:
> On Wednesday, 14 May 2014 at 11:13:10 UTC, John Colvin wrote:
>> On Wednesday, 14 May 2014 at 09:38:10 UTC, Chris wrote:
>>> I have code that uses the following:
>>>
>>> string[][size_t] myArray;
>>>
>>> 1. myArray = [0:["t", "o", "m"], 1:["s", "m", "i", "th"]];
>>>
>>> However, I've found out that I never need an assoc array and
>>> a "linear" array would be just fine, as in
>>>
>>> 2. myArray = [["t", "o", "m"], ["s", "m", "i", "th"]];
>>>
>>> Is there any huge difference as regards performance and
>>> memory footprint between the two? Or is 2. basically 1. under
>>> the hood?
>>
>> If you don't need the features of associative arrays, don't
>> use them.
>>
>> Normal arrays are much simpler, faster and (due to some
>> outstanding problems with associative arrays in D) less
>> bug-prone. Associative arrays, by definition, require a lot
>> more work behind the scenes for both reading and writing.
>
> The question is, if they are _much_ faster. With this type
> (string[][size_t]) I haven't encountered any bugs yet. On the
> other hand, it introduces some rather stilted logic sometimes,
> as in
>
> foreach (size_t i; 0..myArray.length) {
> // do something with myArray[i];
> }
>
> because it's not sorted. This, or I sort it first. Anyway,
> there's always an overhead associated with associative arrays.
> I'll have to see how big this breaking change would be, and
> decide, if it's worth it.
>
> Profiling is not really feasible, because for this to work
> properly, I would have to introduce the change first to be able
> to compare both. Nothing worse than carefully changing things
> only to find out, it doesn't really speed up things.
Yes, they are much faster. Normal array indexing is equivalent to
*(myArray.ptr + index) plus an optional bounds check, whereas
associative array indexing is a much, much larger job.
Why were you using associative arrays in the first place? Unless
your keys are somehow sparse* or of a non-integer type there
isn't any reason to.
* How I see that constraint in that context:
(maxKey - minKey) / nElements > 1 + epsilon
where epsilon is the maximum proportional wasted space you could
afford in a normal array (emptyElements / usedElements). Bear in
mind the memory overhead of associative arrays is itself non-zero.
Also, while normal arrays tend to be more cache friendly than
associative arrays, this isn't true for very sparse arrays.
More information about the Digitalmars-d-learn
mailing list