Kinds of containers

Wed Oct 21 14:02:26 PDT 2015

On Wednesday, 21 October 2015 at 20:13:31 UTC, jmh530 wrote:
> On Wednesday, 21 October 2015 at 18:38:10 UTC, Jonathan M Davis 
> wrote:
>>
>> A static array is of a fixed size, which almost no other 
>> containers are. It also lives entirely on the stack, which 
>> almost no other containers do. If there's a container that 
>> lives entirely on the stack, then maybe it would make sense 
>> for it to be a value type, but _very_ few containers fall in 
>> that category, and all of the classic containers like vector, 
>> linked list, map, etc. have no business being value types 
>> IMHO. It's just error-prone. Heck, static arrays are quite 
>> error-prone thanks to the fact that they convert to dynamic 
>> arrays, but they do serve a purpose. So, maybe there are 
>> containers that fall in the same category, but I expect that 
>> such containers are pretty obviously value types and not 
>> reference types, because their nature makes them that way. 
>> Regardless, I don't see how it's reasonable in general to make 
>> a container be a value type. It's just asking for trouble. If 
>> there's any question at all whether a container should be a 
>> value type or a reference type, IMHO, it should be a reference 
>> type.
>>
>
> I do find myself mixing up static and dynamic arrays...
>
> I'll generally defer to you. I suppose within the context of 
> Andrei working on containers with std.allocator, then 
> containers that only live on the stack seems beyond the scope 
> anyway.
>
> Your statement just had struck me as weird as I had thought of 
> people like manu saying at various points that they only kept 
> data structures on the stack. My next thought was that static 
> arrays are usually on the stack and are value types. So I 
> figured that whatever the performance people are using were 
> similar.

Static arrays are a bit of a special case, because they're 
essentially a chunk of memory on the stack. They can be extremely 
useful and efficient, but you have to be careful with them. Most 
containers don't act anything like them.

But like I said, there are some container types which inherently 
make sense on a stack, but most don't. For the most part, the 
kinds of containers that we're talking about here - vectors/array 
lists, linked lists, maps, sets, etc. - have to have most of 
their guts living on the heap. You'd have to _really_ contort 
things to put them entirely on the stack, and you'd be seriously 
limiting how many elements that they could hold if you did. If 
you make a vector or a linked list a value type, then it'll have 
a few members on the stack, but most of them will be on the heap, 
and it'll be a value type only because its postblit constructor 
explicitly copies the portion that's on the heap. It's not really 
naturally a value type. It's naturally more of a reference type 
or pseudo-reference type.

If the vector is implemented as a reference type, then the few 
member variables that were on the stack in the value type version 
end up on the heap with the rest of the container's guts, so you 
use slightly more memory for them, but it's much more natural 
that way, since the main guts of the container were on the heap 
anyway. And if the container is a value type, you risk copying 
unnecessarily and/or keeping around 
references/pointers/iterators/ranges to its elements after it's 
been destroyed, whereas with a reference type, it'll only be 
copied when you specifically request it. Treating them as value 
types is inefficient, error-prone, and unnatural.

Now, folks like Manu do do some crazy stuff eke out every bit of 
performance that they can, and they do sometimes use stuff like 
alloca to put stuff that would otherwise be on the heap on the 
stack. But a lot of what they do isn't very general purpose 
either, and they generally avoid stuff like the containers in 
standard libraries, because they use the heap too much and/or 
aren't tailored to their needs. They also tend to do stuff that's 
way more error-prone in an effort to eke out that extra bit of 
performance.

If we can do stuff to support them, then great, but in general - 
especially when you're talking about the library and not the 
language - the kinds of stuff that they want don't work for other 
folks. If someone like Manu wants an efficient container, he 
_might_ want one that's not a full-on reference type simply so 
that less of it is on the heap, but he sure isn't going to want 
to copy it except when he absolutely has to. So, the fact that 
it's a value type isn't necessarily useful. Having it simply live 
where it's constructed (be it on the stack or as a member 
variable) with only the stuff that has to be on the heap on the 
heap but have the container have a disabled postblit would 
probably be just as useful in most cases, and scoped or region 
allactors should be able to help with that. I'm not super 
familiar with all of the tricks that they pull to make their 
container use efficient, but I have a hard time seeing how 
std::vector's semantics are desirable to them. Certainly, when 
they care about efficiency to the point that they're using static 
arrays everywhere, the fact that vector has its guts on the heap 
means that it's in a different class of containers entirely from 
static arrays and does not have the same desirable properties, 
even if it is a value type like static arrays. And even then, the 
fact that static arrays are value types isn't really the gain. 
It's more that they're on the stack, and that naturally turns 
them into value types.

In any case, I'm pretty sure that we do have a region allocator 
in std.experimental.allocator, so if someone wants to try and put 
a container that normally lives on the heap on the stack, it 
should be possible (at least, with a limited number of elements). 
And it might actually be easier to do that with a container 
that's entirely on the heap instead of part of it living on the 
stack and part of it living on the heap (as occurs with the C++ 
containers).

If Manu (or others like him) have strong feelings about what 
kinds of containers would be best for them, they can speak up and 
point out what they need, but I don't see how having value type 
containers would generally be useful to them except in the cases 
where the container actually, fully lives on the stack (which is 
almost never the case), and it seems even more questionable for 
everyone else.

- Jonathan M Davis