confusing (buggy?) closure behaviour

Fri Dec 12 12:21:40 PST 2008

Denis Koroskin Wrote:

> On Fri, 12 Dec 2008 22:18:28 +0300, Zoran Isailovski  
> <dmd.zoc at spamgourmet.com> wrote:
> 
> > Denis Koroskin Wrote:
> >
> >> On Fri, 12 Dec 2008 19:32:03 +0300, Zoran Isailovski
> >> <dmd.zoc at spamgourmet.com> wrote:
> >>
> >> > I'm an experienced C#, Java and Python programmer, and have employed
> >> > closures (and C# delegates) upon numerous occasions. While experimenting
> >> > with D closures and delegates, I was struck by a phenomenon I cannot
> >> > explain. Here's the code:
> >> >
> >> > module closures01;
> >> >
> >> > import std.stdio;
> >> >
> >> > alias int delegate(int arg) Handler;
> >> >
> >> > Handler incBy(int n)
> >> > {
> >> > 	return delegate(int arg){ return arg + n; };
> >> > }
> >> >
> >> > Handler mulBy(int n)
> >> > {
> >> > 	return delegate(int arg){ return arg * n; };
> >> > }
> >> >
> >> > void test1()
> >> > {
> >> > 	writefln("\ntest1:\n----------------------------------------");
> >> > 	int x = 10, y;
> >> > 	y = mulBy(3)(x); writefln("%d * 3 -> %d", x, y);
> >> > 	y = mulBy(4)(x); writefln("%d * 4 -> %d", x, y);
> >> > 	y = incBy(2)(x); writefln("%d + 2 -> %d", x, y);
> >> > }
> >> >
> >> > void test2()
> >> > {
> >> > 	writefln("\ntest2:\n----------------------------------------");
> >> > 	int x = 10, y;
> >> > 	Handler times3 = mulBy(3);
> >> > 	Handler times4 = mulBy(4);
> >> > 	Handler plus2 = incBy(2);
> >> > 	y = times3(x); writefln("%d * 3 -> %d", x, y);
> >> > 	y = times4(x); writefln("%d * 4 -> %d", x, y);
> >> > 	y = plus2(x); writefln("%d + 2 -> %d", x, y);
> >> > }
> >> >
> >> > public void run()
> >> > {
> >> > 	test1();
> >> > 	test2();
> >> > }
> >> >
> >> > /* **************************************** *
> >> >  * Compiled with: Digital Mars D Compiler v1.030
> >> >  *
> >> >  * (Unexplainable) program output:
> >> > test1:
> >> > ----------------------------------------
> >> > 10 * 3 -> 30
> >> > 10 * 4 -> 40
> >> > 10 + 2 -> 12
> >> >
> >> > test2:
> >> > ----------------------------------------
> >> > 10 * 3 -> 20
> >> > 10 * 4 -> 42846880
> >> > 10 + 2 -> 4284698
> >> >
> >> > * **************************************** */
> >> >
> >> > What goes wrong???
> >>
> >> I'd say that it works as expected and here is why.
> >>
> >> First of all, there are two types of closures: static and dynamic
> >> closures.
> >> Closures work by having a hidden pointer to the function frame where all
> >> local variables are stored.
> >>
> >> When a static closure is created, all the function's local variables are
> >> stored on the stack.
> >> It has the advantage that no memory allocation takes place (fast).
> >> It has the disadvantage that once the delegate leaves the scope, it
> >> becomes invalid, since the variables were stored on the stack and the
> >> stack is probably overwritten (unsafe).
> >>
> >> A dynamic closure allocates memory on the heap and all the local
> >> variables are placed there.
> >> It has the disadvantage that memory is allocated for each dynamic closure
> >> (might be slow if dynamic closures are created often).
> >> It has the advantage that a dynamic closure may leave the scope, i.e. you
> >> may save it and call it whenever you want.
> >>
> >> D1 supports static closures only! That's why your code doesn't work (in
> >> test1 the stack is still valid, but in test2 the stack gets overwritten).
> >> D2 has support for dynamic closures. Just try it - your sample works as
> >> is.
> >
> > Thx, Denis, but I'm still confused. The stack thing was also my first  
> > thought. But when I tried to actually explain the dynamics that way, I  
> > came to the conclusion that then, test1() shouldn't have worked either.
> >
> > I assumed the following (schematic) process takes place:
> >
> > <code>
> > mulBy(3)(x)
> > => push 3; call mulBy;
> > // upon entry into mulBy: Stack = [ >&mulBy, 3, ... ]
> > // I assume, the callee cleans up the stack, so...
> > // upon return from mulBy: Stack = [ &mulBy, 3, >... ]; cpu_register = &delegate
> > => push x; call [cpu_register]
> > // upon entry into delegate: Stack = [ >&delegate, x, ... ]
> > </code>
> >
> > (Here, the stack / frame pointer is denoted by ">", and moves from right  
> > to left on push, left to right on pop)
> >
> > With that mechanism, when the delegate is entered, the memory where  
> > previously the number 3 was stored, should have been overwritten by x.
> >
> > But it obviously isn't ?!?
> >
> > How come?
> 
> No, you are taking it slightly wrong.
> The delegate stores a raw /pointer to stack frame/ so it doesn't depend on  
> current stack head (ESP).

And that's exactly what I was assuming, Denis. But function calls still use the stack to pass parameters and store the return address, don't they?

So unless the way the stack, the stack pointer (ESP) and the frame pointer (EBP) work has changed dramatically since I last dealt with them, a "snapshot" of the situation just before returning from mulBy should look like this:

         lo --- stack memory --- hi
stack: ..., retadr, 3, ...
ESP:        ^
rawptr to n:        ^

"rawptr" is your raw pointer to the stack frame (the memory block on the stack visible inside mulBy, i.e., basically the value of EBP inside mulBy).

Indeed, it is not bound to ESP or EBP. Upon return from mulBy, the situation should look like this:

         lo --- stack memory --- hi
stack: ..., retadr, 3, ...
ESP:                   ^
rawptr to n:        ^

rawptr has not changed.

Then, upon the call to the delegate, x and the return address are pushed onto the stack. This changes ESP *and* the contents of the stack memory, probably resulting in something like:

         lo --- stack memory --- hi
stack: ..., retadr, x, ...
ESP:        ^
rawptr:             ^

rawptr still points to the same memory, but the memory has been changed. It now contains the value of x, not 3.

This, at least, used to be how things worked at the machine level.
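
To make the model behind those three snapshots concrete, here is a minimal sketch that replays them on a toy stack. Everything in it (ToyStack, RETADR, simulate) is made up for illustration. It assumes that every argument and return address is pushed to memory and that the callee cleans up, which is only my assumption and not necessarily what the compiler actually generates:

```d
import std.stdio;

// Hypothetical model, not real compiler output: a tiny downward-growing
// stack in which every argument and return address is pushed to memory.
struct ToyStack
{
    int[8] mem;        // simulated stack memory, index 0 = lowest address
    size_t sp = 8;     // simulated ESP; starts one past the highest slot

    // Push a value and return the index (the "address") it was stored at.
    size_t push(int value)
    {
        sp -= 1;
        mem[sp] = value;
        return sp;
    }

    // Release `count` slots; the memory itself is left untouched.
    void pop(size_t count)
    {
        sp += count;
    }
}

enum int RETADR = -1;  // placeholder standing in for a return address

// Returns what the remembered "rawptr" slot holds after mulBy returns,
// and again after the delegate has been entered with argument x.
int[2] simulate(int n, int x)
{
    ToyStack s;
    int[2] seen;

    // Call mulBy(n): push the argument, then the return address.
    size_t rawptr = s.push(n);  // the slot the delegate would remember
    s.push(RETADR);
    s.pop(2);                   // mulBy returns; callee cleans up
    seen[0] = s.mem[rawptr];    // still n: popping does not erase memory

    // Call the delegate with x: the same two slots are reused.
    s.push(x);
    s.push(RETADR);
    seen[1] = s.mem[rawptr];    // now x, according to this model
    s.pop(2);

    return seen;
}

void main()
{
    auto seen = simulate(3, 10);
    writefln("after mulBy returns: %d", seen[0]);
    writefln("inside the delegate: %d", seen[1]);
}
```

Under these assumptions the slot holds 3 right after mulBy returns and 10 once the delegate is entered, so the delegate would multiply x by itself rather than by 3. That is the prediction that disagrees with the output of test1.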

Don't get me wrong. I'm aware that the above "theory" does not explain the behavior in my sample code. I'd just like to know why, that's all.