Kinds of containers

Wed Oct 21 19:13:10 PDT 2015

I needed containers, so I implemented a few:
https://github.com/bitwise-github/d-collections

Currently included:

List(T) - contiguous list like std::vector
LinkedList(T) - doubly linked list like std::list
Queue(T) - double-ended queue/contiguous circular buffer
Stack(T) - contiguous fifo stack

I would like to eventually add some kind(s) of hash table(s), but 
it's a huge topic, and I don't immediately need one.

I wrote all of the containers from scratch, so they are an 
accurate/current representation of what I would like to see.

The highlights:

1) both value and reference usage is possible.

example: <list.d>
struct ListBase { ... }   // value type list
alias RefCounted!ListBase List;   // ref counted list

List!int  list1;             // list1 is ref counted
ListBase!int  list2;    // list2 is a value type

I initially had this reversed, where List(T) was a value type, 
and ListRef(T) was a reference type, but I flipped it the other 
way to avoid unneeded copying for naive usage. I'm still not sure 
this is the right direction though. Optimization is a job of it's 
own, and it's a job for a pro. I believe the amount of effort 
spent to idiot-proof(for lack of a better term) the containers 
should be limited. Especially with the power of D's ranges, I 
think containers should generally(and in my code, usually do) 
stay put. IMO, If someone wants to pass something around, they 
should pass a range, not the container itself. My solution, 
however, does support both(ref/value), if needed.

What's currently being done in std.container.Array(T) is very 
limiting. With my approach, you have the option of value type or 
reference type. You do not have this option with the current 
"reference type" containers. Also, while you're okay if you get a 
range to the container, there is extra indirection every time you 
access the container through opIndex(), back(), front(), etc.. I 
shouldn't have to pay for this extra indirection/allocation, 
especially when the containers are about to be designed from 
scratch with this issue fully known. By embedding a reference 
count in the container, you limit the maximum achievable 
performance for what should be(IMO) an obscure use case(passing a 
container around). Again though, my solution allows both usages.

2) Cursors(iterators)

My Cursors are similar to C++ iterators, but have several key 
differences.

a) There is no begin/end/rbegin/rend/cbegin/cend/crbegin/crend... 
The cursor knows it limits, and is considered "alive" until it 
has been advanced(by one) beyond either end of the container. 
Similar to an empty range, an invalidated Cursor is gone for 
good, but can be copied ahead of time, before iteration.

b) cursors can be joined to form a range. Unlike C++ however, 
both cursors must point to elements in the container since there 
is no begin/end, and the join is inclusive. There is no "past the 
end of the container" type of Cursor to refer to. Joining two 
cursors can, at the minimum, result in a Range with a length of 1.

example:

List!int a = [1, 2, 3];
auto cur1 = a.first;
auto cur2 = a.last;
auto rng1 = cur1 ~ cur2; // join two cursors
assert(rng1.equal([1, 2, 3]));
auto rng2 = ~cur1; // convert cursor to range
assert(rng2.equal([1]));
assert(cur1 && cur1.alive);
--cur1;
assert(!cur1 && !cur1.alive);

for(auto c = a.first; c; ++c)
	writeln(*c);

My containers' find() methods return cursors, not ranges. The 
result is that code like this actually removes the element you 
intended to remove, and nothing else:

a.remove( a.find(1) );

find() should not return elements that do not match the query the 
programmer supplied as an argument, which is currently the case 
with range-based find().

I'm still considering whether the traditional C++ 
approach(begin/end/etc..) is a better idea, but haven't reach a 
conclusion. I do know, however, that ranges alone are not enough.

3) Gravy.

My containers contain, by default, most of what you'll need to 
use them. The counter argument is code-reuse, but my containers 
are lightyears away from prohibitively complex, so I think the 
way I've done things is fine. Including core functionality into 
containers allows for more efficient implementations in some 
cases. For example, a LinkedList(T) sort. This approach is also 
much friendlier to documentation and intellisense, and allows a 
programmer to find everything they need in one place.

--

There is still work to do, of course.

On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
>Design by Introspection.

I prefer simple, predictable constructs.

> 2. Reference containers.
> 3. Eager value containers.

Both.

      Bit