shallow copy of const(Object)[]

Sat Nov 1 17:33:42 PDT 2014

On Saturday, November 01, 2014 10:30:05 anonymous via Digitalmars-d-learn 
wrote:
> On Saturday, 1 November 2014 at 00:08:23 UTC, Jonathan M Davis via
> Digitalmars-d-learn wrote:
> > So, by shallow copy, you mean that you want an array that
> > contains the same
> > elements but is a new array?
>
> yes
>
> > If that's what you want, just slice the array.
> >
> > auto b = a[];
>
> This is the same as `auto b = a;`.
>
> > Sure, they'll point to the same spot in memory still, but
> > because all of the
> > elements are const, you can't mutate them, so it really doesn't
> > matter. All it
> > affects is that because both arrays have the same block of
> > memory after them,
> > appending to one of them would make it so that appending to the
> > other would
> > force it to reallocate. For most intents and purposes, you
> > essentially
> > have two different arrays.
>
> Except I don't. The elements are not immutable, so they may be
> pulled from under my feet from elsewhere. And more importantly,
> `a` may refer to non-GC memory. Like the stack, for example:
>
> class C
> {
>       const(Object)[] objects;
>       this(const(Object)[] objects ...)
>       {
>           this.objects = objects;
>       }
> }
> C f()
> {
>       return new C(new Object);
> }
> void main()
> {
>       auto c = f();
>       auto o = c.objects[0];
>       f(); /* writing over the previous array */
>       assert(o is c.objects[0]); /* fails */
> }

Okay. Dynamic arrays don't care what kind of memory backs them. If you have

auto b = a;

or

auto b = a[];

you end up with two different dynamic arrays which are slices of the same
memory. Appending to one does not affect the other, but as long as they're
slices of the same memory, altering the elements of one will affect the
other. That can't be done through a or b if the element type is const or
immutable. However, if a were a const slice of an array of mutable elements

Object[] x;
const(Object)[] a = x;

then because x has mutable elements, mutating those elements through x would
be visible through both a and b. However, the type of memory that's backing
the arrays is irrelevant. It could be GC-allocated, or malloc-allocated, or
even a static array. The only two differences that you have if the memory is
not GC-allocated are

1. It is guaranteed that the extra capacity in the array is 0, meaning that
any append or reserve operations will reallocate and make it a GC-allocated
array, whereas a GC-allocated array could have extra capacity and avoid being
reallocated.

2. It's possible that the memory is freed, screwing up the dynamic array.
That's why slicing a static array or pointers should be @system. Slicing
pointers is, but unfortunately, slicing a static array is not currently
treated as being @system ( https://issues.dlang.org/show_bug.cgi?id=8838 ).
Regardless, it's up to the code that's doing the slicing to make sure that it
doesn't give the slice to anything that's going to keep it around beyond the
lifetime of the memory.
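
To make the first point concrete, here is a minimal sketch (the helper name
appendMoves is made up for illustration, not something from druntime). It
assumes the usual behavior that capacity reports 0 for a slice that isn't
backed by an appendable GC block:

```d
// Hypothetical helper: returns true if appending to the slice moved it to
// a different block of memory. The slice is passed by value, so the
// caller's slice is left alone.
bool appendMoves(int[] s)
{
    auto before = s.ptr;
    s ~= 0;
    return s.ptr !is before;
}

void main()
{
    int[4] fixed = [1, 2, 3, 4];
    int[] s = fixed[];              // dynamic array backed by a static array

    assert(s.capacity == 0);        // no room to grow in place
    assert(appendMoves(s));         // so appending has to reallocate
    assert(fixed == [1, 2, 3, 4]);  // the static array itself is untouched

    int[] g = new int[](4);         // GC-allocated
    assert(g.capacity >= g.length); // may have room to grow in place
}
```

Whether appending to g actually happens in place depends on how much spare
capacity the GC block has, so the sketch doesn't assert anything about that.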

So, while I can see why having a dynamic array backed by a static one (or
malloc-ed memory) would make it so that you want to make sure that you've
copied the array's elements into a new block of memory, it's not actually
relevant to discussing how such a copy would be made.

If you want to make sure that a dynamic array refers to new memory and is not
a slice of another one, then you'd typically use dup or idup, and in almost
all cases, that's exactly what you want. However, you have the rather odd case
of trying to make sure that you end up with a new block of memory, but you're
dealing with an array of const reference types. Most of the time when dealing
with const, you're dealing with it because the function took const so that it
could operate on both mutable and immutable types, and you wouldn't be trying
to create a new array from those values - doing so would then lock in const
instead of mutable or immutable, which tends to be limiting. It's generally
discouraged to keep stuff around as const beyond simply operating on it that
way to avoid having to duplicate code or to give code temporary access to
something without allowing it to mutate it. And most functions that would
have to allocate a new array would be templated so that the original constness
could be retained (e.g. Foo[] or immutable(Foo)[]) rather than having to
allocate a const(Foo)[]. If Foo is a value type, then const(Foo)[] is
pointless anyway, because you've copied the elements. However, with Foo being
a reference type - as in your case - that's not going to work.
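
As a rough sketch of what I mean by templating to retain the original
constness (everyOther is a made-up function, and it uses int rather than a
class so that the elements can simply be copied):

```d
// Hypothetical example: allocates a new array holding the elements at even
// indices. T carries the qualifier of the elements, so the result has the
// same element type as the argument.
T[] everyOther(T)(T[] arr)
{
    T[] result;
    result.reserve((arr.length + 1) / 2);
    foreach (i, e; arr)
    {
        if (i % 2 == 0)
            result ~= e;
    }
    return result;
}

void main()
{
    int[] m = [1, 2, 3, 4, 5];
    immutable(int)[] im = [1, 2, 3, 4, 5];

    auto a = everyOther(m);
    auto b = everyOther(im);

    // Mutable in, mutable out; immutable in, immutable out.
    static assert(is(typeof(a) == int[]));
    static assert(is(typeof(b) == immutable(int)[]));
    assert(a == [1, 3, 5]);
    assert(b == [1, 3, 5]);
}
```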

If you really do need to make a copy of the memory, then you're going to need
to actually end up with const(Object)[], because you can't convert
const(Object)[] to Object[] or immutable(Object)[] without doing a deep copy
(which is why dup and idup are failing for you). If we had cdup, then that
would presumably do what you want, but since in most cases, it would be
discouraged to be allocating a dynamic array of const elements (rather than
slicing it or allocating one with mutable or immutable elements), I doubt that
we really want to add cdup or that Walter would think that it's a good idea.
std.array.array _should_ work for that, however, since it's just allocating an
array with the same element type as the element type of the range that it's
given (which could be an array). Unfortunately, it clearly is buggy in this
case, which pretty much leaves you with something like what I showed you before:

const(Object)[] b;
b.reserve(a.length);
foreach(e; a)
    b ~= e;

though you could use Appender instead to make it somewhat faster. But again,
in most cases, you'd just slice the array. And if the reason that you don't
want to do that is that another slice might contain mutable elements,
allocating a new array will only protect you from the array element itself
being mutated (i.e. the reference in this case), whereas because you're
dealing with a reference type, it doesn't protect you at all from the object
being mutated. For that, you need a deep copy.
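
Here is a small sketch of that difference. Counter and shallowCopy are
hypothetical names used only for this example, and shallowCopy is just the
reserve-and-append loop wrapped in a function:

```d
class Counter
{
    int value;
    this(int value) { this.value = value; }
}

// Hypothetical helper: copies the references into a freshly allocated
// array with the same element type. The objects themselves are not copied.
T[] shallowCopy(T)(T[] a)
{
    T[] b;
    b.reserve(a.length);
    foreach (e; a)
        b ~= e;
    return b;
}

void main()
{
    Counter[] x = [new Counter(1), new Counter(2)];
    const(Counter)[] a = x;              // const view of mutable elements
    const(Counter)[] s = a[];            // slice: shares memory with x
    const(Counter)[] c = shallowCopy(a); // copy: its own block of memory

    // Rebinding an element through x shows up in the slice, not the copy.
    auto original = x[0];
    x[0] = new Counter(10);
    assert(s[0].value == 10);
    assert(c[0] is original);
    assert(c[0].value == 1);

    // Mutating the object itself shows up in both of them.
    x[1].value = 20;
    assert(s[1].value == 20);
    assert(c[1].value == 20);
}
```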

But regardless, nothing can mutate the elements, or what they refer to,
through a any more than it can through b (which is why I was saying that you
should just slice a and not bother with allocating), so the only reason to
allocate new memory for b to avoid mutation is if a was ultimately sliced
from an array of mutable elements.

- Jonathan M Davis