Kinds of containers

Wed Oct 21 11:48:13 PDT 2015

On Wednesday, 21 October 2015 at 18:05:07 UTC, Jonathan M Davis 
wrote:
> On Wednesday, 21 October 2015 at 16:36:46 UTC, Brad Anderson 
> wrote:
>> On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
>> Alexandrescu wrote:
>> [snip]
>>> 2. Reference containers.
>>>
>>> These have classic reference semantics (à la Java). 
>>> Internally, they may be implemented either as class objects 
>>> or as reference counted structs.
>>>
>>> They're by default mutable. Qualifiers should apply to them 
>>> gracefully.
>>>
>>> 3. Eager value containers.
>>>
>>> These are STL-style. Somewhat surprisingly I think these are 
>>> the worst of the pack; they expensively duplicate at the drop 
>>> of a hat and need to be carefully passed around by reference 
>>> lest performance silently drops. Nevertheless, when used as 
>>> members inside other data structures value semantics might be 
>>> the appropriate choice. Also, thinking of them as values 
>>> often makes code simpler.
>>>
>>> By default eager value containers are mutable. They should 
>>> support immutable and const meaningfully.
>>
>> Having both reference and value semantics for containers would 
>> be great. I don't understand why reference semantics would be 
>> implemented by the container themselves though. Why not a 
>> general purpose RC! (or RefCounted! if the current design is 
>> deemed sufficient) that can apply to anything, including 
>> containers? Then you'd only need to implement the value 
>> semantic containers (and maybe throw in some RC version 
>> aliases to promote the use of the RC versions so the option 
>> isn't overlooked). It seems kind of crazy that anything in D 
>> that wants to be reference counted would need to implement the 
>> logic themselves.
>>
>> If there are performance advantages (I haven't thought of any 
>> but perhaps there are) to bake the RC right into the container 
>> it might also be possible to use DbI take advantage of it in 
>> RC! when appropriate.
>>
>> It just seems so wrong to implement reference counting dozens 
>> of times independently, especially when that means 
>> implementing all the containers twice too.
>
> If we had value type containers and reference type containers, 
> I would assume that they would at least share implementation, 
> and maybe the reference types would just be wrappers around the 
> value types. However, I completely fail to understand why you'd 
> ever want a container that was a value type.

Well they are more efficient when used correctly (no inc/decs). 
Using them correctly does take some additional knowledge and 
consideration though. C++11 did take care of a whole class of 
implicit copying and RVO has long taken care of another class of 
them. I don't think it happens (in C++) quite as often as you are 
making it out to be (at least in my experience) but I'll admit I 
have had a bug or two caused by modifying a copy when I thought I 
was changing a reference. I could just as easily have bugs caused 
by modifying a reference when I thought I was modifying a copy 
though. Either approach comes with its own considerations.

> In my experience, it's very error-prone and adds no value.

Unless we end up with compiler assisted inc/dec elision they are 
always going to be faster. Perhaps marginally faster but still 
faster and the realtime guys are always going to demand the 
fastest option.

> It just makes it too easy to accidentally copy a container, and 
> it can be pretty easy to have an iterator, range, etc. 
> referring to a container that's already been destroyed (similar 
> to having a dynamic array referring to a static array that's 
> left scope).

Having the ranges of a container keep the reference count would 
make the speed advantage of the value types even more of a 
factor. That might actually change the performance advantage away 
from being just subtle to being strong. We'd need to benchmark to 
say for sure, of course.

Just a sidenote regarding iterator invalidation: C++, when 
following the new C++ Core Guidelines, is actually able to do 
static analysis of iterators for memory safety without ref count 
overhead. Herb Sutter showed a preview version of MSVC doing it 
during his keynote this month at CppCon[1]. It blew me away that 
they were about to do that with C++.

> As long as the containers have a dup method (or whatever we 
> call it) so that they can be copied when you do want to copy 
> them, I would think that that was more than enough. What do you 
> get with a value type container that you consider better than a 
> reference type? It's not like it lives on the stack as a value 
> type. Most of the container's guts are on the heap regardless.
>
> - Jonathan M Davis

All that said, I agree with you that the implicit copying isn't 
really all that desirable. Perhaps there is a better way. What if 
instead of value semantics we had unique containers (akin to 
std.typecons.Unique or unique_ptr in C++) and required people to 
always state whether it was a copy or move? That takes care of 
the error prone implicit copying while retaining the performance 
characteristics.

1. https://www.youtube.com/watch?v=hEx5DNLWGgA