Iterating over multiple collections in parallel

Steven Schveighoffer schveiguy at yahoo.com
Thu Jul 3 10:27:18 PDT 2008


"Koroskin Denis" wrote
> If found myself solving the same problem again and again:
> how to implement simultaneous iteration over two (or more collections)?
>
> suppose, we have the arrays:
> int[] a = [1, 2, 3];
> int[] b = [1, 4, 9];
>
> and would like to multiply them per-component, like this:
>
> int[] c = new int[a.length];
> for (int i = 0; i < a.length; ++j) {
>     c[i] = a[i] * b[i];
> }
>
> Since foreach is the prefered (and sometimes the only) way to iterate over 
> collection, there should be a way to iterate over multiple collections 
> simultaneously, like (just an example) this:
>
> int[] c = new int[a.length];
> foreach (i : a; j : b; ref k : c) {
>     k = a + b;
> }
>
> But this isn't supported (yet?)
> More complicated example would be an iterating over two (or more) 
> collection with *no* random access iterators available, opApply only.
>
> I spend a lot of time on this and have found no solution how to do it with 
> existing D feature set, but this is surely doable with a few 
> inter-function gotos and an exception if collections sizes don't match. 
> It's just a small layer written in asm, nothing more.
>
> How about an enhancement proposal? :)

The issue here is how to formulate each iteration.  foreach works because it 
is well defined "iterate over each value in the container".  But iterating 
over multiple containers, you could want different orders of iteration, 
iterating for each unique set of indexes, etc.

Or you could want what you had originally stated, in which cases the lengths 
must be correct.  We already have foreach_reverse, which is like foreach, 
but iterates in the opposite direction.  I don't want foreach_X for 
everyones preferred iteration method of choice.

I think the best solution here is a fruct (foreach struct) in a library:

struct myfruct
{
   static myfruct opCall(int[] a, int[] b, int[] c)
   {
      myfruct result;
      result.a = a;
      result.b = b;
      result.c = c;
      assert(a.length == b.length && a.length == c.length);
   }

   int[] a;
   int[] b;
   int[] c;
   int opApply(int delegate(ref int i, ref int j, ref int k) dg)
   {
      int result = 0;
      for(int i = 0; i < a.length; i++)
      {
         if((result = dg(a[i], b[i], c[i])) != 0)
             break;
      }
      return result;
   }
}

usage:

foreach(i, j, ref k; myfruct(a, b, c))
{
    k = i * j;
}

Now, the issue here is, how can this be made generic?  The answer is, we 
need iterators.  Otherwise you cannot iterate over multiple containers that 
are not index-based (such as AA's or maps).  With iterators, that have a 
well defined interface (like opApply), that allows one to increment 
individual indexes, and not have to know what indexes are made of how how 
they work.  They just define opInc and opStar.  Oh, and we would need 
reference return types too (so opStar can return an actual reference).

-Steve 





More information about the Digitalmars-d mailing list