Variant and large structs / static arrays.

Sun Oct 13 23:30:30 PDT 2013

Recently I was trying to fix a bug in std.variant that prevents 
a Variant from storing static arrays larger than its maximum size. 
The basic fix is simple: allocate T.sizeof * length bytes on the 
heap and use that memory as the array.

The problem is assigning instances of Variant to each other. The 
current approach is that value types behave as values and 
reference types behave as references. That is, after assigning an 
int[4] to a Variant and then assigning that Variant to another 
Variant, modifying either one does not modify the other. But with 
the heap-allocation fix, assigning one Variant to another would 
leave both storing the same array, so modifying one would modify 
the other. (I believe large structs probably have this problem as 
well, but it's not visible since you can't modify them within the 
Variant at the moment.)
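
To make the aliasing concrete, here is a minimal sketch using a toy 
container. ToyBox, inlineCap and the field names are hypothetical and 
are not how std.variant is implemented; the threshold of 4 ints is 
only chosen to match the int[4] case above. Small payloads live 
inside the struct, larger ones are placed on the GC heap.

```d
// Hypothetical toy container, NOT std.variant: payloads of up to
// inlineCap ints are stored in the struct itself, larger payloads
// are stored on the GC heap and referenced through a slice.
struct ToyBox
{
    enum inlineCap = 4;        // assumed inline capacity
    int[inlineCap] inlineData; // copied together with the struct
    int[] heapData;            // copying the struct shares this memory
    size_t length;

    this(const int[] src)
    {
        length = src.length;
        if (length <= inlineCap)
            inlineData[0 .. length] = src[];
        else
            heapData = src.dup; // the hidden heap allocation
    }

    int opIndex(size_t i) const
    {
        return length <= inlineCap ? inlineData[i] : heapData[i];
    }

    void opIndexAssign(int value, size_t i)
    {
        if (length <= inlineCap)
            inlineData[i] = value;
        else
            heapData[i] = value;
    }
}

void main()
{
    // Fits inline: the copy is independent (value semantics).
    auto small = ToyBox([1, 2, 3, 4]);
    auto smallCopy = small;
    small[1] = 999;
    assert(smallCopy[1] == 2);

    // Too large: both copies share one heap array (reference semantics).
    auto large = ToyBox([1, 2, 3, 4, 5]);
    auto largeCopy = large;
    large[1] = 999;
    assert(largeCopy[1] == 999);
}
```

The same struct copy gives two different behaviours depending only on 
the payload size, which is the inconsistency described above.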

One solution would be to perform a copy when assigning variants, 
but this seems like it could be a large hidden performance hit. 
Every time you pass a variant containing a large static array 
into a function, return one, or assign one variable to another, 
it would have to allocate on the heap and copy the contents. This 
could be made somewhat less severe by tracking whether the array 
is a reference to an existing one, or whether any other variants 
hold references to the underlying array, and then creating a copy 
only when peek, opApply, op____Assign, etc. are called. This gets 
complicated quickly and still comes with the original hidden 
allocation costs. Of course, an initial heap allocation is still 
required to store the large static array within the Variant 
anyway. The documentation does not currently state what happens 
when a variant is assigned to another variant, or in which 
situations the underlying data is treated as a reference.
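
As a rough illustration of the eager-copy option, the sketch below 
uses a postblit to duplicate the heap payload on every struct copy. 
CopyingBox and its allocation counter are hypothetical names made up 
for this example; this is not std.variant's code, and the lazy 
(copy-on-write) variant is not shown.

```d
// Hypothetical sketch of the "copy on assignment" option.
// Not std.variant: every struct copy duplicates the heap payload.
struct CopyingBox
{
    int[] heapData;
    static size_t allocations; // thread-local counter, for illustration

    this(const int[] src)
    {
        heapData = src.dup;
        ++allocations;
    }

    // Postblit: runs after each copy of the struct, so the new
    // instance gets its own array instead of sharing the original.
    this(this)
    {
        heapData = heapData.dup;
        ++allocations;
    }

    int opIndex(size_t i) const { return heapData[i]; }
    void opIndexAssign(int value, size_t i) { heapData[i] = value; }
}

void main()
{
    auto v = CopyingBox([1, 2, 3, 4, 5]); // first allocation
    auto v2 = v;                          // postblit: second allocation
    v[1] = 999;

    assert(v2[1] == 2);                   // value semantics restored
    assert(CopyingBox.allocations == 2);  // at the price of a hidden copy
}
```

Value semantics come back, but each plain-looking copy now costs a GC 
allocation plus a copy of the whole array, which is the hidden cost 
I'm worried about.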

Simply leaving it as pass-by-reference could get quite 
confusing, even if documented. For example:

int[SIZE] data = [1, 2, ..., SIZE];
Variant v = data;
Variant v2 = v;
v[1] = 999;
assert(v2 != v);

With SIZE <= 4, the assertion would pass. With SIZE > 4, the 
assertion would fail. In both situations the original data would 
remain unmodified. Is there a better solution that would not 
involve these reference vs value semantics or hidden allocations? 
Using the copy solution, either 'Variant v2 = v' would involve a 
GC heap allocation or else 'v[1] = 999' would, depending on the 
approach used.