Array types not treated uniformly when passed as ranges

Mon Feb 14 19:59:52 PST 2011

On Monday 14 February 2011 18:18:39 jam wrote:
> Hi all,
> 
> Just curious as to the difference in the built-in variable length
> array vs. the std.container.Array and fixed length arrays when it
> comes to using them in functions that take Ranges.
> 
> For instance the following does not compile:
> 
> import std.algorithm;
> import std.stdio;
> import std.range;
> import std.conv;
> import std.container;
> import std.array;
> 
> void main() {
> 
>     int[5] builtin_fixed;
>     int[] builtin_variable;
>     Array!(int) con_array;
> 
>     con_array.length(5);
>     builtin_variable.length = 5;
> 
>     fill(builtin_variable, 9); //ok, no error
>     isSorted(builtin_variable); //ditto
> 
>     //The following 4 statements produce errors
>     fill(builtin_fixed, 9);
>     fill(con_array, 9);
> 
>     isSorted(con_array);
>     isSorted(builtin_fixed);
> 
> }
> 
> The errors are variations on:
> 
> Error: template std.algorithm.fill(Range,Value) if
> (isForwardRange!(Range) && is(typeof(range.front = filler))) does not
> match any function template declaration
> Error: template std.algorithm.fill(Range,Value) if
> (isForwardRange!(Range) && is(typeof(range.front = filler))) cannot
> deduce template function from argument types !()(int[5LU],int)
> 
> If I change those 4 statements to:
> 
>     fill(builtin_fixed[], 9);
>     fill(con_array[], 9);
> 
>     isSorted(con_array[]);
>     isSorted(builtin_fixed[]);
> 
> effectively passing ranges (std.container.Array!(int).Array.Range in
> the case of con_array, and int[] for builtin_fixed) which then works
> as expected.  This all makes sense, and it's easy enough to write
> wrappers,   but I would (well and I did) expect the first way to just
> work.   This may just be a nitpick I guess, but being new to the
> language this little detour involved quite a bit of time research (not
> a bad thing, I did learn quite a bit in the process), but makes me
> wonder if I am missing something fundamental regarding when I should
> be using these different array types.

It's because an Array is not a range. Dynamic arrays are a bit special in that 
they're both a container and a range. An Array is just a container. But 
honestly, you wouldn't really want it to work.

When you pass a dynamic array, it creates a new array which references the old 
one. That means that as long as you don't increase the size of either array or 
assign another array to one of them, they will continue to point to the same 
data and altering one's elements will alter the others, since their elements are 
the same. However, increasing the size of either array might cause a 
reallocation (it might not if there's room passed the end of the array) and they 
would then point to two separate sets of data. This means that you can pass an 
array to a range-based function and it can continually use popFront or popBack 
to take a smaller and smaller slice of the array. The elements can be altered if 
that's what the algorithm does, but altering the size of the passed in array 
(which popFront and popBack would obviously do) does _not_ alter the size of the 
original array. So, it's safe for the range-based function to treat the passed 
in array as a range without altering the original array.

However, this won't work for containers in general. If you pass an Array, you're 
passing the container. The passed in Array is identical to the original 
(currently, Array is a struct with reference semantics thanks to internal 
reference counting, but soon it will become a class and more obviously have 
reference semantics). As such, if you alter the passed in Array, you alter the 
original. So, if you were to continuously pop the front off of the passed in 
Array, you would be continually popping the front off of the original Array. So, 
a function like find would actually remove the front part of your Array as it 
processed it, and if what you were trying to find wasn't there, you'd have an 
empty Array. The way to fix this is to not have Array be a range. Instead, you 
have a helper type which is a range over it. You get that with the slice 
operator.

You don't have the problem with arrays that you'd have with user-defined 
container types, because the semantics of arrays are a bit odd when you copy 
them. So, if anything were faulty, it would be the built in arrays, not user-
defined container types like Array. It is quite handy to have arrays work how 
they work, however, so that's not likely to change.

So, if you want consistency, use [] even when using arrays. It's the same thing 
really. The resulting code probably isn't even any different, and you have to do 
that with static arrays anyway. So, it's just plain more generic to always slice 
when passing to a range function.

- Jonathan M Davis