RFC on range design for D2

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Sep 9 04:06:55 PDT 2008


Robert Jacques wrote:
> On Mon, 08 Sep 2008 23:53:17 -0400, Andrei Alexandrescu 
> <SeeWebsiteForEmail at erdani.org> wrote:
> 
>> Robert Jacques wrote:
>>> On Mon, 08 Sep 2008 20:37:41 -0400, Andrei Alexandrescu 
>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>> Denis Koroskin wrote:
>>>>> 3) Walter mentioned that built-in array could be re-implemented 
>>>>> using a pair of pointers instead of ptr+length. Will it ever get a 
>>>>> green light? It fits range concept much better.
>>>>
>>>> Walter told me to first implement my design, and if it works, he'll 
>>>> do the change. Yes, it does fit ranges much better because the 
>>>> often-used next and, um, pop will only touch one word instead of two.
>>>  I'd warn that changing away from ptr+length would create logical 
>>> incosistencies between 1D arrays and 2D/3D/ND arrays.
>>
>> How so?
> 
> An ND array is typically defined as a fat pointer like so:
> struct array(T,size_t N) {
>     T* ptr;
>     size_t[N]    lengths;      // of each dimension
>     ptrdiff_t[N] byte_strides; // of each dimension
> }
> So a 1D array is
> {
>     T* ptr;
>     size_t lengths;
>     ptrdiff_t byte_strides = T.sizeof; //Currently a compile time 
> constant in the built-in array
>     size_t length() { return lengths; }
> }
> which is logically consistent with a general dense matrix and aside from 
> some name change and the stride being a compile time constant, is 
> identical to the current D arrays.
> However, { T* first; T* last } may not be logically extended to ND 
> arrays, particularly sliced ND arrays, as T* last no longer has any 
> meaning.

Hmmm, I see. That could become a problem if we wanted lower-dimensional 
matrices to be prefixes of higher-dimensional matrices. This is a worthy 
goal, but one that my matrices don't pursue.

>>>>> 4) We need some way of supporting dollar notation in user 
>>>>> containers. The hack of using __dollar is bad (although it works).
>>>>
>>>> It doesn't work for multiple dimensions. There should be an 
>>>> opDollar(uint dim) that gives the library information on which 
>>>> argument count it occured in. Consider:
>>>>
>>>> auto x = matrix[$-1, $-1];
>>>>
>>>> Here the dollar's occurrences have different meanings. A good start 
>>>> would be to expand the above into:
>>>>
>>>> auto x = matrix[matrix.opDollar(0)-1, matrix.opDollar(1)-1];
>>>  I'd also add that multiple dimension slicing should be supported. i.e.
>>> auto x = matrix[2..5,0..$,3]
>>> would become
>>> auto x = 
>>> matrix.opSlice(Slice!(size_t)(2,5),Slice!(size_t)(0,matrix.opDollar(0)),3) 
>>>
>>> with
>>> struct Slice (T) { T start; T end; }
>>> Strided slices would also be nice. i.e. matrix[0..$:10] // decimate 
>>> the array
>>
>> Multidimensional slicing can be implemented with staggered indexing:
>>
>> matrix[2..5][0..$][3]
> 
> Yes, but doing so utilizes expression templates and is relatively slow:
> matrix_row_slice temp1 = matrix.opSlice(2,5);
> matrix_col_slice temp2 = temp1.opSlice(0,$);
> matrix                 = temp2.opIndex(3);
> 
> And causes code bloat. Worst matrix[2..5] by itself would be an unstable 
> type. Either foo(matrix[2..5]) would not compile or it would generate 
> code bloat and hard to find logic bugs. (Due to the fact that you've 
> embedded the dimension of the slice operation into the type).

What is an unstable type?

There is no use of expression templates, but indeed multiple slices are 
created. This isn't as bad as it seems because the desire was to access 
several elements, so the slice is supposed to be around for long enough 
to justify its construction cost.

I agree it would be onerous to access a single element with e.g. 
matrix[1][1][2].

>> means: first, take a slice 2..5 that returns a matrix range one 
>> dimension smaller. Then, for that type take a slice from 0 to $. And 
>> so on.
>>
>> This works great for row-wise storage. I'm not sure how efficient it 
>> would be for other storage schemes.
> 
> No it doesn't. It works great for standard C arrays of arrays, but these 
> are not matrices and have a large number of well documented performance 
> issues when used as such. In general, multi-dimentional data structures 
> relatively common and should be cleanly supported.

[Citation needed]

>> Note how nice the distinction between the container and its views 
>> works: there is only one matrix. But there are many ranges and 
>> subranges within it, bearing various relationships with one another.
> 
> Yes, Data+View (i.e MVC for data structures) is a good thing(TM). But 
> generally, matrices have been views into data and not the data 
> themselves. (unless needed for memory management, etc)

Well if better terminology comes along I'm all for it. I want to define 
the "matrix storage" as a owning container, and several "matrix ranges" 
that access the data stored by the storage.


Andrei


More information about the Digitalmars-d-announce mailing list