RFC on range design for D2

Mon Sep 8 19:53:19 PDT 2008

Robert Jacques wrote:
> On Mon, 08 Sep 2008 20:24:27 -0400, Andrei Alexandrescu 
> <SeeWebsiteForEmail at erdani.org> wrote:
> 
>> About the "collection should be a range itself" mantra, I've had a 
>> micro-epiphany. Since D's slices so nicely model at the same time 
>> arrays and their ranges, it is very seductive to think of carrying 
>> that to other collection types. But I got disabused of that notion as 
>> soon as I wanted to define a less simple data structure. Consider a 
>> matrix:
>>
>> auto a = BlockMatrix!(float, 3)(100, 200, 300);
>>
>> defines a block contiguous matrix of three dimensions with the 
>> respective sizes. Now a should be the matrix AND its range at the same 
>> time. But what's "the range" of a matrix? Oops. As soon as you start 
>> to think of it, so many darn ranges come to mind.
>>
>> * flat: all elements in one shot in an arbitrary order
>>
>> * dimension-wise: iterate over a given dimension
>>
>> * subspace: iterate over a "slice" of the matrix with fewer dimensions
>>
>> * diagonal: scan the matrix from one corner to the opposite corner
>>
>> I guess there are some more. So before long I realized that the most 
>> gainful design is this:
>>
>> a) A matrix owns its stuff and is preoccupied with storage internals, 
>> allocation, and the such.
>>
>> b) The matrix defines as many range types as it wants.
>>
>> c) Users use the ranges.
>>
>> For example:
>>
>> foreach (ref e; a.flat) e *= 1.1;
>> foreach (row; a.dim(0)) row[0, 0] = 0;
>> foreach (col; a.dim(1)) col[1, 1] *= 5;
> 
> I'd recommend a more clear cut example. Three of the ranges are very 
> well defined in array languages and libraries. Essentially a slice of a 
> matrix is another matrix that may have less or more dimensions and 
> therefore may be a collection in addition to a range.

There are two problems with the view that a slice of a matrix is also a 
matrix:

1. If ownership is desired, then the slice does not own anything in the 
matrix, so that does not put it on equal footing with the matrix it 
started with.

2. Slicing a block matrix on a hyperplane will result in a strided 
range. That is not a block matrix at all. So again it is more useful to 
think of the block matrix as the store, and of various ranges crawling 
over it as ways to look at the matrix.

I agree that ranges could be shoehorned into working. But then all 
ownership is out of the window, and also it creates more confusion than 
it clears. Now for n possible ranges over a block matrix, you have a 
daunting task ahead. You'd need to be able to construct any from any 
other, otherwise you could get stuck with a range that's a sort of dead end.

That could be solved in a number of ways (e.g. force all range2range 
conversions to go through some "central" range) but by that time you 
start to realize that that design is not quite gainful. Besides, it only 
takes care of matrices, but not of various other nonlinear structures

The approach in which the container is preoccupied with storage and 
ownership, and it offers various ranges that view the container in 
various ways, sounds like the better design to me.

> The dimension-wise 
> range is the only operation which is more complex, due to the type and 
> dimensions of the returned array changing float[x,y,z] -> float[x,y][z]. 
> And the main argument is that a float[x,y][z] is large, slow to create 
> and unwanted, so a separate range/generator is better (Also note a 
> generator can provide implicit the head const, tail mutable nature of 
> the range). Even given this, it doesn't contratict the "collection 
> should be a range itself" mantra, since there is a very well defined 
> range which encompasses the data, its just that some ranges are more 
> optimal if they're only views, and not a root collection.

Again, I agree a range-only view can be shoehorned into working. I just 
think it would be a bad design.

Andrei