A few assiociative arrays measurements
Guillaume Piolat
first.name at gmail.com
Thu Feb 8 14:01:54 UTC 2024
Lately, I got a large speed-up using a N-heap rather than a
binary heap in a priority queue. But it's not the only binary
trees I had, Phobos' Red-Black tree is like that.
Could we stuff more items per tree node?
As memory gets relatively more expensive than CPU, does that wins
some precious cycles?
Looking at the Map I was using, I've compared a few different
equivalent for string[int] / Map!(int, string) / associative
arrays.
- one is the regular builtin AA, a hashmap (druntime)
- one is based upon Phobos red-black tree (rbtree.d), adapted to
use manual allocation
- one is based on a new Boost-licensed, B-tree implementation in
Dplug (btree.d)
- one is the TreeMap in emsi-container package, itself using a
T-Tree (treemap.d, ttree.d)
The results are summed up in this presentation:
https://dplug.org/tutorials/Dplug%20Tutorials%2018%20-%20The%20Case%20Against%20Binary%20Trees.pdf
Notes:
- For an in-memory collection, a B-Tree seems generally more
performant than a Red-Black tree nowadays. In fact I believe any
tree with more than 2 children will outperform a tree with only 2
children per node.
- However Red-Black Tree are used because C++ std::map must be
able to insert without invalidating existing iterators. It's
possible with RB-Tree but not B-Tree.
- In my benchmarks I don't understand why `emsi_containers`
T-Tree act like this at look-up; it looks slower than one could
expect. Bug? So I wasn't able to rightfully compare T-Tree with
B-Tree. It shouldn't be the GC, but possibly it's adding the GC
roots that do this?
- builtin AA uses a lot of tricks in the book and, as a hashmap,
can outperform the others who are instead sorted collections. But
not necessarily at look-up.
More information about the Digitalmars-d
mailing list