Smart slicing

bearophile bearophileHUGS at lycos.com
Thu Mar 20 16:39:27 PDT 2008


Hello, I have discussed this topic a bit on the IRC #D channel too.

Array slicing is a very useful feature of D. In Python the array (list) slicing is even more powerful, because array indexes can be negative (they wrap around, so -1 equals to $-1 in D, but I like the D solution too, it's faster too), and it has a third optional parameter (stride) that can be a negative number too:

>>> "abcdefghilmn"[2]
'c'
>>> "abcdefghilmn"[2:11]
'cdefghilm'
>>> "abcdefghilmn"[2:]
'cdefghilmn'
>>> "abcdefghilmn"[:2]
'ab'
>>> "abcdefghilmn"[-1]
'n'
>>> "abcdefghilmn"[-2]
'm'
>>> "abcdefghilmn"[2:-2]
'cdefghil'
>>> "abcdefghilmn"[11:2]
''
>>> "abcdefghilmn"[11:2:-1]
'nmlihgfed'
>>> "abcdefghilmn"[11:2:-2]
'nlhfd'
>>> "abcdefghilmn"[2:30]
'cdefghilmn'
>>> "abcdefghilmn"[30:50]
''

(I presume D has used .. instead of : because : is used by AAs, and {} can't be used to denote AAs because it's used already in the C syntax).
I don't miss all those features in D (D arrays have .reverse that replaces the common case of stride=-1), but I miss the feature you can see in the last two examples, that is the overflowing index is put back to the actual max-min index bound of the array.

A bare-bone semantic like that can be implemented in D like this (no negative indexes, no optional stride. But it's easy to add a third overloaded function for the stride too):

T[] slice(T)(T[] array, int start) {
    int end = array.length;
    if (end <= start)
        return new T[0];
    if (start < 0)
        start = 0;
    return array[start .. end];
}

T[] slice(T)(T[] array, int start, int end) {
    if (end <= start)
        return new T[0];
    if (start < 0)
        start = 0;
    if (end > array.length)
        end = array.length;
    return array[start .. end];
}

later you can write:
"abcdefghilmn".slice(2,30)


Some questions:
1) What's the purpose of this smart slicing in the core of the language?
It avoids errors, and speeds up programming a little because you need less care in slicing bounds.
In Python and Ruby you use such feature often.
You can use it to take "50 characters or less" from a string (less if the string is shorter than 50), or to safely produce empty slices in some situations. I know you can do the same with D, for example:

somestring[0 .. $ < 50 ? $ : 50]
Or:
somestring[0 .. min(50, $)]

But it's less immediate, expecially when the 50 is a variable, so you need:

somestring[0 .. min(max(x, 0), $))]

Another small example, replace/create heads of sub-arrays with a given value (not in-place):


import d.func: list, putr, map;

T[][] replace_heads1(T)(T[][] seq, T x) { // Error: ArrayBoundsError
    T[] sub;
    return list([x]~sub[1..$], sub, seq);
}

T[][] replace_heads2(T)(T[][] seq, T x) {
    T[] sub;
    return list(sub.length ? [x]~sub[1..$] : [x], sub, seq);
}

T[][] replace_heads3(T)(T[][] seq, T x) { // smart slicing
    T[] sub;
    return list([x]~sub.slice(1), sub, seq);
}

void main() {
    int[][] l = [[1, 2, 3], [4, 5], [6], []];
    //putr(replace_heads1(l, 0));
    putr(replace_heads2(l, 0)); // Prints: [[0, 2, 3], [0, 5], [0], [0]]
    putr(replace_heads2(l, 0)); // Prints: [[0, 2, 3], [0, 5], [0], [0]]
}

(list() is like a Python list comphrension)

2) 'Smart slicing' can't be used on poiter-based arrays (C ones), because their length isn't known at runtime (but it can be done at compile time on static arrays). This just means you can't rely on smart slicing on such lower-level arrays.

3) Such smart slicing is slower, and I think at compile-time the compiler can't optimize it much. D is designed to be quite faster than Python/Ruby, so such slowdown may be too much.
C++ gives at() and [] to index a a vector, in a safe and unsafe way, so for D it may be invented a different syntax for safe slicing. I think those slice() functions may be enough (and you have to import them from the std lib), when the language will allow to use $ in user functions too. In the meantime the $ hask may be enough.

Bye,
bearophile



More information about the Digitalmars-d mailing list