resizeable arrays: T[new]

Mon Jun 4 02:27:31 PDT 2007

Sean Kelly wrote:
> Jarrett Billingsley wrote:
>> "Sean Kelly" <sean at f4.ca> wrote in message 
>> news:f3uoj5$1b1i$1 at digitalmars.com...
>>
>>> Most array algorithms would apply.  But I'm still not sure I see the 
>>> point of having an immutable reference, because it's just passed by 
>>> value anyway.  Who cares if the size of the array is modified within 
>>> a function where it's not passed by reference?  The change is just 
>>> local to the function anyway.
>>
>> If that array is pointing into some meaningful area of memory (like in 
>> the example, a texture buffer), resizing the array could (probably 
>> would) move the array around, which I guess isn't illegal but then the 
>> function operating on the array wouldn't be accessing the correct 
>> place.  Prevent them from changing the length, it prevents them from 
>> accessing anywhere but there. 
> 
> Well yeah.  I don't personally think this is a problem because it 
> doesn't affect the callee in any way, but I can see how others might 
> disagree.  Doesn't 'final' do this now though?

There was a lot of talk about this problem today. Final doesn't quite do 
it, because it's not part of the type signature of the function. But 
making it a type signature adds a whole raft of complexity, and also 
inevitably completely screws up the notion of transitivity of const and 
invariant.

(The problem is if I pass a slice to a function, and then that function 
reallocates the slice by changing the length or appending to it, it can 
affect other slices into the same data in very hard-to-explain ways. 
Essentially, we needed to find a way to disallow it. Options explored 
and found unacceptable were final (causes more problems), ref counting 
arrays (too expensive), always duping (too expensive), adding 'slice' 
bits to the fat pointers (doesn't work), adding ref counting to the 
memory chunks (too expensive), and data flow analysis hacks (unreliable, 
inconsistent). Note that other languages solve the problem with ref 
counting, but we considered that unacceptable for D because it adds 50% 
to the cost of transferring arrays around, and big runtime costs in 
manipulating the ref counts. And that doesn't even consider 
multithreading or exception safety requirements.)

Now, it turns out that it is very rare for a function to legitimately 
want to resize a buffer passed to it. So we finally hit on the idea of 
making a resizeable array a different type, say:

    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

    a.length = 6;   // error, static array
    b.length = 6;   // error, dynamic array is not resizeable
    c.length = 6;   // ok
    b = c;          // ok, implicit conversion of resizeable to dynamic
    c = b;          // error
    c = b[0..n];    // error
    b = c[0..n];    // ok, but watch out if c is later resized
    b = c[0..n].dup; // ok, no worries
    c = cast(T[new])b; // ok, but this will be deliberate

Note that .dup will return a T[new], as will operator new. Slicing will 
return a T[]. The runtime representation of T[new] will be identical to 
that of T[], just the type is different.

So, if our function parameter is typed as T[], we know there isn't going 
to be any monkey business going on in that function with our other 
slices. If it's T[new], we know we need to take a hard look at it.

I think this will entail only very rare changes to user code, but will 
be a big improvement in type safety. Note that it will *still* possible 
for array resizing to affect other slices into the same array, but it's 
far, far less likely to happen inadvertently. The attractiveness of this 
solution is it nearly completely solves the problem with zero runtime cost.