Array slices and copy on write

Mon Apr 4 11:30:34 PDT 2011

On 04.04.2011 15:53, Steven Schveighoffer wrote:
> On Sun, 03 Apr 2011 08:29:47 -0400, simendsjo
> <simen.endsjo at pandavre.com> wrote:
>
>> D will copy the original content if a slice expands, but is the
>> following behavior
>> implementation specific, or part of the specification?
>>
>> auto a = [0,1,2];
>> auto b = a[0..2];
>> a.length = 2; // is a[3] already marked for collection by the gc?
>> assert(a.ptr == b.ptr);
>> b.length += 1; // as b cannot reuse it even though noone is using it
>> assert(a.ptr == b.ptr); // ooops, fails.. b is reallocated
>
> There are two incorrect assumptions here:
>
> 1. The D array runtime code cannot possibly know how many references
> there are to a certain memory block, so your assertion that a[3] is
> unreferenced is not something that the compiler or runtime can know
> without horrendous performance implications.

Ah, of course. Makes sense.

> 2. The D GC collects entire memory blocks, not portions of them. There
> is no way for the GC to "collect" a[3]. It can only collect the entire
> memory block which contains a.

Given the above, I guessed so.
Then you must copy data to a new array manually in case you want to free 
excess memory..?

int[] a;
a.length = 10_000_000; // my initial guess
// fill a
a.length = 100; // only 100 used. GC won't ever free the memory not used
auto b = a[0..100].dup; // copy
a = null; // and destroy

Or is it possible to mark part this for the GC as unused?

> The array runtime code does it's best effort to expand an array into its
> own memory block. Because slicing an array or copying an array reference
> is a very fast, efficient event, it is not hooked by the array runtime
> code. Therefore, the array runtime code cannot know all the array slices
> that point to the memory block. The assumption it MUST make is that
> somebody somewhere has a pointer to the data it considers valid. If you
> think about it, the bookkeeping necessary to track how many
> pointers/arrays point to each byte of an array would dwarf the cost of
> storing and using the array in the first place.

Makes sense.

> You must remember, copy-on-expansion (note, copy-on-write is not the
> right term, that implies changing any data in the array makes a copy),
> or rather the lack of copying, is an optimization. In most cases, you
> should expect expansion can make a copy, but you shouldn't rely on that
> behavior.

Copy on expansion is a much better term, yes.

> For those cases where you want to *force* the runtime to consider data
> not referenced by an array as being 'free', you can use assumeSafeAppend:
>
> b.assumeSafeAppend(); // marks b[3] as being unused
>
> b.length += 1; // no copy made.
>
> -Steve

I cannot find assumeSafeAppend.

Thanks a lot for your very thorough and detailed answer. Cleared up a 
couple of blind spots.