[Issue 6788] New: std.range.pairwise?

d-bugmail at puremagic.com d-bugmail at puremagic.com
Fri Oct 7 16:19:58 PDT 2011


http://d.puremagic.com/issues/show_bug.cgi?id=6788

           Summary: std.range.pairwise?
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody at puremagic.com
        ReportedBy: bearophile_hugs at eml.cc


--- Comment #0 from bearophile_hugs at eml.cc 2011-10-07 16:19:10 PDT ---
This D2 program contains a very common iteration patten:


import std.stdio, std.typecons;
void main() {
  auto data = [1, 2, 3, 4];
  foreach (i, x1; data)
    foreach (x2; data[i+1 .. $])
      writeln(tuple(x1, x2)); // do something with x1 and x2
}


It prints (dmd 2.056head):

Tuple!(int,int)(1, 2)
Tuple!(int,int)(1, 3)
Tuple!(int,int)(1, 4)
Tuple!(int,int)(2, 3)
Tuple!(int,int)(2, 4)
Tuple!(int,int)(3, 4)


As you see it is quite different from "zip".

So I suggest to add a std.range.pairwise range that does the same thing (same
output):


import std.stdio, std.range;
void main() {
    auto data = [1, 2, 3, 4];
    foreach (tup; pairwise(data))
        writeln(tup);
}


This coding pattern is very common. It happens when you have a sequence of
items, and you want to work (apply a function) on all the pairs, and your
diadic function (function with two arguments) doesn't care for its arguments
order (symmetric function).

Often O(n ** 2) algorithms have to see all the pairs. But often the diadic
function is symmetric. So you need to scan only half of the matrix, an upper
triangle, or lower one. This is where pairwise is useful for. It's not a
necessary range (even zip is not necessary in D), because two loops are enough
to replace it, but it helps a bit because it avoids mistakes with the array
indexes. I have done similar mistakes more than once in past.

pairwise(sequence) is mostly meant to be used with a random-access input
sequence. It's not hard to make it work with forward ranges too, but it becomes
less efficient.


In Python I use a similar generator, defined just like this:


from itertools import combinations
def pairwise(seq):
    return combinations(seq, 2)


Usage examples:

>>> list( pairwise([]) )
[]
>>> list(pairwise([1,2]))
[(1, 2)]
>>> list( pairwise("abc") )
[('a', 'b'), ('a', 'c'), ('b', 'c')]
>>> list(pairwise([0, 1, 2, 3]))
[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
>>> list(pairwise(range(5)))
[(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3,
4)]


I expect Phobos to eventually have two very useful lazy ranges that generate
permutations and combinations. But pairwise is supposed to be very efficient,
here I'd like the compiler to generate code similar to two loops inside each
other, like in the first code example. This is probably hard to do if pairwise
is implemented with a call to a combinations() function.


Why is pairwise just about pairs? Isn't a more general solution better?


[20:33]    bearophile    A general function that takes n ranges and yields all
possible unsorted n-tuples is possible, but in practice this is a very uncommon
thing in my code. This is why I have limited it to pairs. It is just about
pairs because this is the most common case. 

A potential problem: some users might wrongly think pairwise(sequence) is the
same as std.range.chunks(sequence,2). The documentation has to specify they are
two quite different things, to avoid some of such mistakes.


This is an alternative design, similar to how lockstep works:

import std.stdio, std.range;
void main() {
    auto data = [1, 2, 3, 4];
    foreach (x1, x2; pairwise(data))
        writeln(x1, " ", x2);
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------


More information about the Digitalmars-d-bugs mailing list