The magic behind foreach (was: Re: Descent 0.5.3 released)

Wed Jan 21 22:50:07 PST 2009

On Thu, 22 Jan 2009 09:30:15 +0300, Ary Borenszweig <ary at esperanto.org.ar> wrote:

> Ary Borenszweig wrote:
>  > BCS wrote:
>  >> Reply to Robert,
>  >>
>  >>> That doesn't look entirely useless, especially for optimization.
>  >>> Perhaps hard to read, but easier than reading the assembly output ;-P!
>  >>>
>  >>
>  >> ditto; now that you have it might as well make it available.
>  >
>  > Ok, I'll work on it. :-)
>
> I still have to work on some stuff, but...
>
> Before:
> ---
> module main;
>
> import std.stdio;
>
> class Foo {
> 	uint array[2];
>
> 	int opApply(int delegate(ref uint) dg) {
> 		int result = 0;
>
> 		for(int i = 0; i < array.length; i++) {
> 			result = dg(array[i]);
> 			if(result)
> 				break;
> 		}
> 		return result;
> 	}
> }
>
> int main(char[][] args) {
> 	Foo foo = new Foo();
> 	foreach(x; foo) {
> 		if (x == 3) {
> 			break;
> 		}
> 		writefln("%s", x);
> 	}
> 	return 0;
> }
> ---
>
> After:
> ---
> module main;
>
> import object;
> import std.stdio;
>
> class Foo: Object {
> 	uint[2] array;
>
> 	int opApply(int delegate(ref uint) dg) {
> 		assert(this, "null this");
> 		{
> 			int result = 0;
> 			for(int i = 0; cast(uint) i < 2; i++) {
> 				result = dg(this.array[cast(uint) i]);
> 				if(result)
> 					break;
> 			}
> 			return result;
> 		}
> 	}
> }
>
> int main(char[][] args) {
> 	Foo foo = new Foo;
> 	foo.opApply(delegate (uint __applyArg0) {
> 		{
> 			{
> 				uint x = __applyArg0;
> 				if(x == 3)
> 					return 1;
> 				writefln("%s", x);
> 			}
> 			return 0;
> 		}
> 	} );
> 	return 0;
> }
> ---
>
> Ummm... I was wondering... In every implementation of opApply, after you
> invoke the delegate you must check the result to see if it's non-zero,
> right? And if it is, you must break out of the iteration.
>
> If the compiler can transform a "foreach" into an opApply call, passing  
> the foreach body and converting breaks to "return 1" statements... can't  
> opApply be specified as:
>
> int opApply(void delegate(ref uint) dg) { // note: delegate returns void
> }
>
> and the compiler transforms the opApply signature to the one that's used
> now, and rewrites each dg call into a call followed by a check of the
> return value: if it is != 0, return 1?
>
> Yes, yes, this is not trivial at all, but it's possible. And then D
> programmers no longer have to write ifs and return magic numbers to make
> foreach work. (think: do it once in the compiler, eliminate thousands of
> lines of boilerplate in programs)
>
> So basically:
>
> ---
> int opApply(void delegate(ref uint) dg) {
>    for(int i = 0; i < array.length; i++) {
>      dg(array[i]);
>    }
> }
> ---
>
> would be converted to:
>
>
> ---
> int opApply(int delegate(ref uint) dg) {
>    for(int i = 0; i < array.length; i++) {
>      int result = dg(array[i]);
>      if (result) return 1;
>    }
>    return 0;
> }
> ---
>
> What do you think?
>
> (By the way, why does opApply return an int? What's the use of that?)
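
On the int question: as far as I understand it, the delegate's return value tells opApply how the loop body exited (0 means "keep going", non-zero means the body hit a break, return or goto), and opApply has to hand that same value back so the code the compiler generates after the call can finish the jump. A minimal sketch, where the Numbers struct and firstAbove function are made-up names for illustration only:

```d
import std.stdio;

// Hypothetical container, used only to illustrate the return-code protocol.
struct Numbers {
    int[] items;

    int opApply(int delegate(ref int) dg) {
        foreach (ref item; items) {
            int result = dg(item);
            if (result)
                return result; // pass the body's exit code back unchanged
        }
        return 0; // the loop ran to completion
    }
}

// The 'return x' inside the foreach body only works because opApply
// propagates the non-zero code back to the compiler-generated caller.
int firstAbove(Numbers n, int limit) {
    foreach (x; n) {
        if (x > limit)
            return x;
    }
    return -1;
}

void main() {
    auto n = Numbers([1, 5, 9]);
    writefln("%s", firstAbove(n, 3));
    writefln("%s", firstAbove(n, 10));
}
```

If opApply swallowed the result and always returned 0, the early return in firstAbove would silently be lost.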

A side question: does a foreach loop allocate memory (for the delegate) in D2? It certainly shouldn't, but escape analysis can't prove that it is safe to allocate the closure on the stack, because opApply is free to store the delegate somewhere:

int delegate(ref int) _dg;

class Foo {
    int opApply(int delegate(ref int) dg) {
        _dg = dg; // dang! the delegate now outlives the foreach loop
        return 0;
    }
}

Perhaps the documentation should state that opApply mustn't store the delegate that is passed to it (doing so makes little sense anyway), so that the compiler can safely treat foreach delegates as static, i.e. stack-allocated (if it doesn't do so already).
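
One way to spell that contract out in code, rather than only in the documentation, would be to mark the delegate parameter as scope. This is only a sketch: it assumes a D2 compiler that accepts the scope storage class on delegate parameters, and whether a given compiler version actually skips the heap allocation because of it is something I haven't verified.

```d
import std.stdio;

class Foo {
    uint[2] array;

    // 'scope' is a promise that dg is not stored anywhere that outlives
    // this call, so the caller's context can stay on the stack.
    int opApply(scope int delegate(ref uint) dg) {
        foreach (ref item; array) {
            if (int result = dg(item))
                return result;
        }
        return 0;
    }
}

void main() {
    auto foo = new Foo;
    foo.array[0] = 1;
    foo.array[1] = 2;

    uint sum;
    foreach (x; foo)
        sum += x; // the body uses a local of main through the delegate context

    writefln("%s", sum);
}
```

With that signature, an opApply that tried the `_dg = dg;` trick above would be breaking its own declared contract instead of an unwritten rule.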