GSoC-2011 project:: Containers

Steven Schveighoffer schveiguy at yahoo.com
Mon Mar 28 04:53:48 PDT 2011


On Fri, 25 Mar 2011 18:37:26 -0400, Jonathan M Davis <jmdavisProg at gmx.com>  
wrote:

> On 2011-03-25 05:41, spir wrote:
>> About D collections: aside std.container in Phobos, Steven Schweighoffer
>> has a fairly advanced project called dcollections:
>> http://www.dsource.org/projects/dcollections. As I understand it, it is  
>> a
>> bit of a concurrent for std.container, but there seems to be a  
>> possibility
>> for them to converge in the future. In any case, you should definitely
>> study it, if only to take inspiration and avoid double work.
>
> dcollections is Steven Schweighoffer's project which has existed since  
> D1.
> std.container is the container module for Phobos. So, they aren't,  
> strictly
> speaking related. When designing std.container and planning out how  
> containers
> should be done in Phobos, Andrei took a different approach than Steve  
> did. So,
> nothing can be taken from dcollections and simply plopped into  
> std.container.
> However, dcollections 2.0 does use the Boost license, so the code from  
> there
> can be refactored to work in std.container. Steve already did that with
> RedBlackTree. He ported std.RedBlackTree from whatever his red-black tree
> implementation is in dcollections. So, if it makes sense, code can be  
> taken
> from dcollections and ported to Phobos (and Steve would obviously be a  
> good
> guy to talk to about that). However, anyone doing that needs to be aware  
> of
> the differences in how dcollections works vs how std.container works  
> (e.g.
> dcollections has cursors whereas std.container uses ranges exclusively).

Any of the private implementations inside dcollections are usable in  
std.container, because they do not expose any public interface.  In fact,  
dcollections' types can be separated into two categories -- interfaces and  
implementations.  The implementations have a very raw, unsafe, simple API  
(no range or cursor support there).  The interface types (which BTW are  
implemented via final classes) expose the common public interface that all  
dcollections classes have, including ranges and cursors.

It is this separation which allowed me to port red black tree to  
std.container by changing nothing in the red black node implementation.   
If you compare RBNode in std.container and RBNode in dcollections, you'll  
find them virtually identical (little cleanup here and there).  In fact, I  
plan to have dcollections' RBTree use std.container's RBNode to avoid code  
duplication.

Unfortunately, red black tree is the most complex part of dcollections, so  
there is not much else to gain by porting to std.container.  I think Deque  
would be a good one, even though it's implementation is not separate (the  
implementation is based on builtin arrays), so the port would be more  
involved.  You could also take the Link implementation (dual-linked list),  
but that is simple enough to write from scratch ;)  The Hash is extremely  
naive and basic, so I'm not sure it's worth copying.  I'm not an  
algorithms expert.

Aside from porting dcollections/implementing equivalent types, I think  
there are some things that would be good to have in phobos:

* Conceptual types that use the implementations, such as a map type.   
These *should* be implementation agnostic as long as you use template  
constraints to identify the appropriate functions required.  Doing this  
should test the completeness of the functions that the containers define.
* Custom allocation.  This has increased dcollections' performance  
significantly.

If you have any questions, do not hesitate to email me at this address.  I  
would be a mentor for this, but 1) I don't have much time (not sure what  
is required) and 2) I have a severe difference of opinion from Andrei on  
what is good in collections, I don't want to guide someone to designs/code  
that won't be accepted.

-Steve


More information about the Digitalmars-d mailing list