Right after allocators: containers or database connectivity?
Brad Anderson via Digitalmars-d
digitalmars-d at puremagic.com
Tue Jun 9 11:49:02 PDT 2015
On Tuesday, 9 June 2015 at 17:05:19 UTC, Andrei Alexandrescu
wrote:
> [snip]
> One would be a good pass of std.container, in particular (a) a
> design review with the DbI glasses on; (b) better documentation
> - sadly it seems to me so inadequate as to make containers
> themselves unusable; (c) investigate use of UFCS -
> std.container's design predates UFCS yet is a perfect fit for
> it, and most likely other cool language improvements we've
> added since.
>
> [snip]
Containers without a doubt. They are so fundamental to my day to
day programming it's amazing that D has gone for so long without
a good set of them (dynamic arrays and associative arrays do go a
long way though).
I might as well share some thoughts on it.
Containers that should be in the standard library in order of
preference (from my C++ STL and Boost experience primarily):
1. vector/array
2. hash map
3. hash set
4. flat map
5. flat set
6. map
7. set
8. deque
9. stack
10. queue
11. linked list
12. hash multimap
13. hash multiset
14. flat multimap
15. flat multiset
16. multimap
17. multiset
18. priority queue
We already have Array for 1 but the interface needs improvement
(I seem to recall having trouble with mutating items during
iteration or something like that which caused me to go back to
dynamic arrays). I don't think I like that it's RefCounted
either. r-value references have made C++'s vector much more
pleasant to efficiently work with in C++. We also have map in the
form of RedBlackTree. I think you might be able to use
RedBlackTree for a set too but I haven't tried it. I think
RedBlackTree isn't a very good name though. It's long and is an
implementation detail.
We have hash maps in the form of the built-in associative arrays
but something that uses the allocators rather than GC is needed.
"Flat" refers to containers that are backed by sorted arrays
rather than something like a Red-Black Tree. Insertion speed
suffers but in practice, thanks to cache locality and move
semantics and fast contiguous memory copying, they are generally
much faster than the traditional map backed by something like a
red-black tree for the majority of operations. Andrei put them in
loki so I'm preaching to the choir here but for anyone that
doesn't know about their advantages here's an article:
http://lafstern.org/matt/col1.pdf and the Boost container
library's rationale:
http://www.boost.org/doc/libs/1_56_0/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx
Chandler Carruth spent a lot of time talking about this stuff at
CppCon 2014 as well: https://www.youtube.com/watch?v=fHNmRkzxHWs
As far as more exotic containers go, I rather like
Boost.MultiIndex which lets you have a container with several
different characteristics (e.g., a single container where you can
look up based on name, based on id, based on insertion order,
based on lexicographic order of name, etc.). You basically design
the perfect container for your needs ("I need random access and
fast lookups and a sorted list and bidirectional lookups between
keys and values"). It's the swiss army knife of containers. I
don't need it often but when I do I love it. I believe someone
said they implemented it for D but I haven't looked into it.
Although less versatile, building multi-indexing into most
containers could prove useful too. Spitballing an example with
plenty of room for improvement:
struct A { string name; int id; }
Map!(A, ((a, b) => a.name < b.name),
((a, b) => a.id < b.id)) a_map;
a_map.insert(A("bob", 7));
assert(a_map.index!0["bob"].id == 7);
assert(a_map.index!1[7].name == "bob");
A trie would be nice to have even though I rarely use them.
Boost has a stable_vector as well which is basically just an
array that uses indirection to keep iterators valid through
insertions/deletions. I could see it being useful but I haven't
used it.
More information about the Digitalmars-d
mailing list