Yet another strike against the current AA implementation
dsimcha
dsimcha at yahoo.com
Sun Apr 26 08:21:54 PDT 2009
So I tried to create a struct to allow iteration over the keys or values of a
builtin associative array using a lazy forward range. This would allow one to
pass the keys or values of an AA to any function that expects a regular old
forward range w/o any heap activity, eager copying, etc. and with O(1)
auxiliary memory usage. Now that ranges are the lingua franca of iteration in
D, this seems like a pretty necessary feature.
The problem is that D's current AAs are based on a hash table with collision
resolution done using binary trees. I guess this decision was made to allow
for O(log N) worst case performance even when there are ridiculous amounts of
hash collisions. However, this is a very bad decision because, in exchange
for more bulletproof performance in corner cases, you lose:
1. Every element is now (void*).sizeof bytes larger because you have to store
a left and right pointer.
2. You can't iterate over a binary tree in O(1) auxiliary space, at least not
in any way I'm aware of. This means that a lazy range to iterate over the
keys or values is relatively useless because it would generate heap activity
anyhow to store the stack or queue necessary to iterate over the tree, so you
may as well just use the current .keys or .values properties, which just
return an array. (Because of the way that ranges are designed, you can't just
use the call stack, unless you use fibers or something, which is obviously not
a win from an efficiency point of view.)
Both of these are important in the more common, general case of sane hashing
performance.
More information about the Digitalmars-d
mailing list