Passing dynamic arrays

Tue Nov 9 12:13:55 PST 2010

Steven Schveighoffer Wrote:

> On Tue, 09 Nov 2010 08:14:40 -0500, Pillsy <pillsbury at gmail.com> wrote:
[...]
> > Ah! This is a lot of what was confusing me about arrays; I still thought  
> > they had this behavior. The fact that they don't makes me a good deal  
> > more comfortable with them, though I still don't like the  
> > non-deterministic way that they may copy their elements or they may  
> > share structure after you append stuff to them.

> As I said before, this rarely affects code.  The common cases I've seen:

> 1. You append to an array and return it.
> 2. You modify data in the array.
> 3. You use a passed in array as a buffer, which means you overwrite the  
> array, and then start appending when it runs out of space.

> I don't ever remember seeing:

> You append to an array, then go back and modify the first few bytes of the  
> array.

I've certainly encountered situations in at least one other language where standard library functions will return mutable arrays which may or may not share structure with their inputs. This has been such a frequent source of pain when using that language that I tend to react very negatively to the possibility in any context.

> Let's assume this is a very common thing and absolutely needs to be  
> addressed.  What would you like the behavior to be? 

Using a different, library type for a buffer you can append to. I think of "a buffer or abstract list you can cheaply append to" as a different sort of type from a fixed size buffer anyway, since it so often is a different type. Arrays/slices are a very basic type in D, and I'm generally thinking that giving your basic types simpler, easier to understand semantics is worth paying a modest cost.
[...]
> IMO, the benefits of just being able to append to an array any time you  
> want without having to set up some special type far outweighs this little  
> quirk that almost nobody encounters.  You can append to *any* array, no  
> matter where the data is located, or whether the data is a slice, and it  
> just works.  I can't see how anyone would prefer another solution!

There's a difference between appending and appending in place. The problem with not appending in place (and arrays not having the possibility of a reserve that's larger than the actual amount, of course) is one of efficiency. Having

auto s = "foo";
s ~= "bar";

result in a new array being allocated that is of length 6 and contains "foobar", and assigning that array to `s`, is obviously useful and desirable behavior. If the expansion can happen in place, that's a perfectly reasonable performance optimization to have in the case of strings or other immutable arrays. Indeed, one of the reasons that functional programming and GC go together like peanut butter and jelly is that together they let you get all sorts of wins in terms of efficiency from shared structure. 

However, I've found working with languages that mix a lot of imperative and functional constructs (Lisp is one, but not the only one) that if you're going to do this, it's really very important that there not be any doubt about when mutable state is shared and when it isn't. D is trying to be that same kind of multi-paradigm language. This means that, for mutable arrays, having

int[] x = [1, 2, 3];
x ~= [4, 5, 6];

maybe reallocate and maybe not seems like it's only really there to protect people from doing inefficient things by accident when they append onto the back of an array repeatedly (or to make that admittedly common case more convenient). This really doesn't strike me as worth the trouble. Like I said elsewhere, the uncertainty gives me the screaming willies. 

Cheers,
Pillsy