[Issue 6536] New: "in" operator for inclusivity in array index range

d-bugmail at puremagic.com d-bugmail at puremagic.com
Sat Aug 20 03:51:13 PDT 2011


http://d.puremagic.com/issues/show_bug.cgi?id=6536

           Summary: "in" operator for inclusivity in array index range
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody at puremagic.com
        ReportedBy: bearophile_hugs at eml.cc


--- Comment #0 from bearophile_hugs at eml.cc 2011-08-20 03:51:05 PDT ---
This is a low-priority enhancement request. What's discussed here is not
exceptionally useful, it's just a small convenience.

In Python programs a quite common operation is to verify an item is present
inside a short list:

>>> L = [3, 7, 11]
>>> x = 7
>>> x in L
True

In Bugzilla there is an old enhancement request that asks for something similar
with the "in" operator on the built-in D arrays. So far it was not accepted,
despite its usefulness because:
- In associative arrays "in" searches on keys (indexes), while on arrays it has
to search on values. This difference in semantics is perceived as a problem
(despite it is not a problem in Python).
- "in" is meant to be a O(ln n) operator or faster.

One proposal was to accept "in" operator only on arrays with a length known at
compile-time, usually short arrays (like in the Python example above). This is
still a linear search, but it's bounded. This is a rather useful operation.

Here I propose something different. To define the "in" operator for the
built-in arrays to search on the indexes. The "in" operator is too much good to
waste. This search has the same semantics as the "in" for associative arrays,
and it's always O(1). Despite this is not nearly as useful as searching among
the array contents, it's useful still, there are some use cases.

A simple use case is for pre-conditions, like:


void foo(int[] array, int x)
in {
    assert(x >= 0 && x < array.length);
} body {}


That assert becomes:

assert(x in array);

This is simpler to read, shorter, and less bug prone. The less bug-prone aspect
of this syntax is not of secondary importance, because such compound tests are
often source of small bugs.

A problem of this syntax is that Python (and other) programmers expect "x in
array" to search among the array values. This requires a bit of mental
retraining.


It's useful for loop invariants:

foreach (i; 0 .. N) {
    // code here
    const aux_index = some expression;
    assert(aux_index >= 0 && aux_index < array.length);
}


It becomes (this also avoids the need of a temporary variable that becomes
useless in release mode builds):


foreach (i; 0 .. N) {
    // code here
    assert((some expression) in array);
}


Another use case is for operations done on neighbours of every item of an array
(here a 2D array):


bool hasGoodNeighbours(int[][] table, int row, int col) {
    foreach (rs; -1 .. 2)
        foreach (cs; -1 .. 2)
            if ((r + rs) >= 0 && (r + rs) < table.length &&
                (c + cs) >= 0 && (c + cs) < table[0].length &&
                isBad(table[r + rs][c + cs])
        return false;
    return true;
}


(This code assumes all the rows have the same length, here I have omitted the
pre-condition that verifies this).

The rs and cs allow to scan the 3x3 neighbourhood of every table item. But the
code needs tests to avoid going outside the matrix for the cells on the frame
of the matrix.


With the proposed idea the code becomes shorter and more readable:


bool hasGoodNeighbours(int[][] table, int row, int col) {
    foreach (rs; -1 .. 2)
        foreach (cs; -1 .. 2)
            if ((r + rs) in table && (c + cs) in table[0] &&
                isBad(table[r + rs][c + cs])
        return false;
    return true;
}


I have had to use "in table[0]" because this proposed idea is just for single
arrays.


In a higher level language you specify where you are searching for, in domain
or codomain. So given a matrix A, you write (this is just a made up syntax, but
something similar is possible in Chapel language, where A is a generic
collection):

xy in A.domain
x in A.codomain

In the first case xy is a (x, y) tuple, that is verified to be a valid index
for the matrix. In the second case the x is an item that is verified to be
contained inside the 2D matrix. This is short and not semantically ambiguous.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------


More information about the Digitalmars-d-bugs mailing list