C++ / Why Iterators Got It All Wrong

Sun Sep 3 07:19:19 PDT 2017

On Sunday, 3 September 2017 at 12:35:16 UTC, Moritz Maxeiner 
wrote:
> On Sunday, 3 September 2017 at 09:24:03 UTC, Ilya Yaroshenko 
> wrote:
>> On Sunday, 3 September 2017 at 02:43:51 UTC, Moritz Maxeiner 
>> wrote:
>>> On Sunday, 3 September 2017 at 02:08:20 UTC, Ilya Yaroshenko 
>>> wrote:
>>>> On Tuesday, 29 August 2017 at 12:50:08 UTC, Robert M. Münch 
>>>> wrote:
>>>>> Maybe of interest: 
>>>>> https://www.think-cell.com/en/career/talks/iterators/#1
>>>>>
>>>>> I haven't read everything, so not sure if it worth to take 
>>>>> a look.
>>>>
>>>> Iterators has no proper alternative when one need to 
>>>> implement generic tensor library like Mir Algorithm [1].
>>>
>>> Out of curiosity: Could you elaborate on what the issues are 
>>> with using a range based API internally (if it's performance, 
>>> the "why")?
>>
>> There are three general kinds of dynamic dense tensors. Mir 
>> implements all of them:
>>
>> 1. Contiguous tensors. Their data is located contiguously in 
>> memory. Single dense memory chunk. All strides between 
>> subs-tensors can be computed from lengths.
>>
>> 2. Canonical tensors. Only data for one dimension is dense, 
>> other dimensions has strides that can not be computed from 
>> lengths. BLAS matrixes are canonical tensors: they have two 
>> lengths and one stride.
>>
>> 3. Universal tensors. Each dimension has a stride. Numpy 
>> ndarrays are universal tensors.
>>
>> Finite random access range Issues:
>>
>> 1. Main API issue is that full featured random access range 
>> (RAR) API (with slicing, and all opIndex* operators like 
>> `[]*=` ) is much larger comparing with unbounded random access 
>> iterator (the same API as pointer).
>
> Won't you have to implement opIndex* operators either way in 
> order to use the `a[]` syntax? The main difference I can see 
> should be that you'd also have to implement the InputRange 
> (front,popFront,empty), ForwardRange (save), and 
> BidirectionalRange (back,popBack) members, which if you don't 
> need them, is indeed understandably too much?

front,
popFront,
empty,
save,
back,
popBack,
opIndexOpAssign,
2 x opIndex(.opSlice) for rhs scalars and ranges
2 x opIndexAssign(.opSlice) for rhs scalars and ranges
2 x opIndexOpAssign(.opSlice) for rhs scalars and ranges

Plus, because of range topology slicing may be slower good to add 
this ones:

popFrontN
popFrontExactly,
popBackN
popBackExactly,

and maybe i forgot something ...

>>
>> 2. Random access range holds its length. Yes, Phobos has 
>> infinity ranges (mir hasn't), but the practice is that almost 
>> any RAR constructed using Phobos has length and you can not 
>> strip when type is constructed. Anyway, infinity RAR is just a 
>> pointer/iterator that can move forward but can not move 
>> backward. Tensors hold their own lengths for each dimensions, 
>> so range's length is just a useless payload.
>
> I'm not sure what you mean here. Is this still about accessing 
> the elements of one tensor? If so, what do Phobos' ranges have 
> to do with your tensor type's API?

ndslice is:
1. Slice structure with a huge amount of features from mir 
algorithm
2. multidimensional RAR: .front!0, front!1, .front!(N - 1) and 
all other generalized RAR primitves including multidimensional 
index [1, 2, 3, ...] and slicing [1 .. 4, 3 .. $].
3. Common one dimensional RAR on top of outer most dimension. So 
one can use Phobos funcitons for Slice structure. For example, a 
matrix will be iterated row-by-row.

But this paragraph was about another issue:

struct BLASMatrixTypeBasedOnRanges
{
    double[] _data; // arrays is the simplest RAR
    size_t rows, cols;
    size_t stride;
}

sizeof(YourBLASMatrixTypeBasedOnRanges) == size_t.sizeof * 5;

in other hand:

sizeof(Slice!(Canonical, [2], double*)) == size_t.sizeof * 4;

Ranges requires for 25% more space for canonical matrixes then 
iterators.

>>
>> 3. Ranges can not move backward (auto b = a[-4 .. $].). This 
>> means one can not implement dynamic `reversed`(flip) [1, 2] 
>> operation for dimensions that have strides (universal or 
>> canonical tensors). _Dynamic_ means that instead of construct 
>> a new type with reversed order for a dimension, we just negate 
>> the corresponding stride and move the cursor.
>
> Right, that is indeed a limitation of the range abstraction 
> (though a type could expose both functionality to move 
> backwards _and_ the range API) and if you need that, it makes 
> sense not to use ranges.
>
> Thanks for the summary of issues :)