opIndex() may hide opSlice()

Fri Mar 10 12:36:35 PST 2017

On Friday, March 10, 2017 10:43:43 H. S. Teoh via Digitalmars-d wrote:
> On Fri, Mar 10, 2017 at 07:41:31AM -0800, Jonathan M Davis via Digitalmars-d wrote:
> > On Friday, March 10, 2017 14:15:45 Nick Treleaven via Digitalmars-d wrote:
> > > On Friday, 10 March 2017 at 01:10:21 UTC, H. S. Teoh wrote:
> [...]
>
> > > > Using opSlice() for slicing (i.e., arr[]) is old,
> > > > backward-compatible behaviour.
> > >
> > > This seems non-intuitive to me (at least for single dimension
> > > containers) - when you see var[], do you think var is being
> > > indexed or do you think var is being sliced like an array
> > > (equivalent to var[0..$])?
> >
> > Yeah, I've never understood how it made any sense for opIndex to be
> > used for slicing, and I've never used it that way.
>
> It's very simple, really.  Under the old behaviour, you have:
>
>   arr[]       --->    arr.opSlice()
>   arr[x]      --->    arr.opIndex(x)
>   arr[x..y]   --->    arr.opSlice(x,y)
>
> This made implementing higher-dimensional slicing operators hard to
> define, especially if you want mixed slicing and indexing (aka
> subdimensional slicing):
>
>   arr[x, y]   --->    arr.opIndex(x, y)
>   arr[x, y..z]    --->    ?
>   arr[x..y, z]    --->    ?
>   arr[w..x, y..z] --->    arr.opSlice(w, x, y, z)  // ?
>
> Kenji's insight was that we can solve this problem by homogenizing
> opSlice and opIndex, such that [] *always* translates to opIndex, and ..
> always translates to opSlice.
>
> So, under the new behaviour:
>
>   arr[]       --->    arr.opIndex()
>   arr[x]      --->    arr.opIndex(x)
>   arr[x,y]    --->    arr.opIndex(x,y)
>
>   arr[x..y]   --->    arr.opIndex(arr.opSlice!0(x,y))
>   arr[x, y..z]    --->    arr.opIndex(x, arr.opSlice!1(y,z))
>   arr[x..y, z]    --->    arr.opIndex(arr.opSlice!0(x,y), z)
>
> This allows mixed-indexing / subdimensional slicing to consistently use
> opIndex, with opSlice returning objects representing index ranges, so
> that in a multidimensional user type, you could unify all the cases
> under a single definition of opIndex:
>
>   IndexRange opSlice(size_t dim)(int x, int y) { ... }
>
>   auto opIndex(I...)(I indices)
>   {
>       foreach (idx; indices) {
>           static if (is(typeof(idx) == IndexRange))
>           {
>               // this index is a slice
>           }
>           else
>           {
>               // this index is a single index
>           }
>       }
>   }
>
> Without this unification, you'd have to implement 2^n different
> overloads of opIndex / opSlice in order to handle all cases of
> subdimensional slicing in n dimensions.
>
> So you can think of it simply as:
>
>   []  ==  opIndex
>   ..  ==  opSlice
>
> in all cases.
>
> It is more uniform this way, and makes perfect sense to me.
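
For anyone wondering what an expression like arr[x, y..z] would actually
do, here is a rough sketch for a 2-D matrix. Matrix and IndexRange are
made-up names for illustration, not anything from Phobos, and the storage
layout is an assumption. As far as I understand the spec, the compiler
passes the dimension that a slice appears in as a compile-time argument
(opSlice!0, opSlice!1), so opSlice is written as a template here.

```d
// Illustrative only: Matrix and IndexRange are hypothetical names.
struct IndexRange
{
    size_t lo, hi;
}

struct Matrix
{
    int[] data;        // row-major storage (an assumption of this sketch)
    size_t rows, cols;

    // x..y in dimension `dim` becomes an IndexRange; nothing is sliced yet.
    IndexRange opSlice(size_t dim)(size_t lo, size_t hi)
    {
        return IndexRange(lo, hi);
    }

    // Lets `$` be used inside the brackets, per dimension.
    size_t opDollar(size_t dim)()
    {
        return dim == 0 ? rows : cols;
    }

    // m[r, c]: a single element.
    int opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }

    // m[r, a..b]: columns a..b of row r.
    int[] opIndex(size_t r, IndexRange c)
    {
        return data[r * cols + c.lo .. r * cols + c.hi];
    }

    // m[a..b, c]: rows a..b of column c (copied, since a column is not
    // contiguous in row-major storage).
    int[] opIndex(IndexRange r, size_t c)
    {
        int[] column;
        foreach (i; r.lo .. r.hi)
            column ~= data[i * cols + c];
        return column;
    }
}
```

With auto m = Matrix([1, 2, 3, 4, 5, 6], 2, 3), m[0, 1 .. 3] picks part
of the first row and m[0 .. 2, 1] picks the middle column. The separate
overloads could be folded into one variadic opIndex with static if, as in
the quoted outline; they are spelled out here only to keep each case
visible.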

Well, thanks for the explanation, but I'm sure that part of the problem here
is that an operation like arr[x, y..z] doesn't even make sense to me. I have
no idea what that does. But I don't normally do anything with
multidimensional arrays, and in the rare case that I do, I certainly don't
need to overload anything for them. I just slap together a multidimensional
array of whatever type it is I want in a multidimensional array. I can
certainly understand that there are folks who really do care about this
stuff, but it's completely outside of what I deal with, and for anything
I've ever dealt with, making opIndex be for _slicing_ makes no sense
whatsoever, and the functionality added to the language with regard to
multidimensional arrays is useless. So, this whole mess has always felt
like I've had something nonsensical thrown at me because of a use case that
I don't even properly understand.

> > I generally forget that that change was even made precisely because it
> > makes no sense to me, whereas using opSlice for slicing makes perfect
> > sense. I always use opIndex for indexing and opSlice for slicing just
> > like they were originally designed.
>
> [...]
>
> This is probably why Kenji didn't deprecate the original use of opSlice,
> since for the 1-dimensional case the homogenization of opSlice / opIndex
> is probably unnecessary and adds extra work for the programmer: if you
> want to implement arr[x..y] you have to write both opSlice and an
> opIndex overload that accepts what opSlice returns, as opposed to just
> writing a single opSlice.
>
> So probably we should leave it the way it is (and perhaps clarify that
> in the spec), as deprecating the "old" use of opSlice in the
> 1-dimensional case would cause problems.
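
To make the extra work in the 1-dimensional case concrete, here is a
minimal side-by-side sketch. OldStyle and NewStyle are hypothetical names,
and using size_t[2] as the value opSlice returns is just one possible
choice.

```d
// Old style: a single opSlice(lo, hi) handles arr[x..y] directly.
struct OldStyle
{
    int[] data;

    int opIndex(size_t i) { return data[i]; }      // arr[i]
    int[] opSlice() { return data; }               // arr[]
    int[] opSlice(size_t lo, size_t hi)            // arr[x..y]
    {
        return data[lo .. hi];
    }
}

// New style: opSlice only describes the range; opIndex does the slicing.
struct NewStyle
{
    int[] data;

    // x..y becomes a pair of bounds (the type is this sketch's choice).
    size_t[2] opSlice(size_t dim)(size_t lo, size_t hi)
    {
        return [lo, hi];
    }

    int opIndex(size_t i) { return data[i]; }      // arr[i]
    int[] opIndex() { return data; }               // arr[]
    int[] opIndex(size_t[2] r)                     // arr[x..y]
    {
        return data[r[0] .. r[1]];
    }
}
```

Both types support arr[i], arr[] and arr[x..y] from the caller's point of
view; the new style just needs the extra opIndex overload that accepts
whatever opSlice returns.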

Well, I'd prefer that the original way be left, since that's all I've ever
needed. If the new way makes life easier for the scientific programmers and
whatnot, then great, but from the standpoint of anyone not trying to provide
multi-dimensional overloads, using opIndex for slicing is just plain
bizarre.

That being said, I'm fine with the compiler detecting if opIndex and opSlice
are declared in a way that they conflict and then giving an error. I just
don't want to be forced to use opIndex for slicing.

- Jonathan M Davis