Proposal: Hide the int in opApply from the user

Mon Jan 7 01:06:52 PST 2008

I proposed this initially over in D.learn, but I'm cleaning it up and 
reposting here in hopes of getting some response from Walter who was 
probably too busy finishing const and eating holiday Turkey at the time 
to notice.  And rightly so.

I don't believe it's appropriate in a high-level supposedly clean 
language like D that one of the main facilities for iterating over user 
types (foreach) requires writing code that passes around magic values 
generated by the compiler (opApply).

It seems wrong to me that these magic values
- come from code generated by the compiler,
- must be handled exactly the proper way by the user's opApply
    (or else you get undefined behavior, but no compiler errors)
- and then are handed back to code also generated by the compiler.

Furthermore, the compiler-generated code in the first and last steps 
share the same scope!  So the solution seems obvious -- they should pass 
the information back and forth using a local variable in their shared 
local scope.
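
For reference, here is a minimal sketch of the current convention being criticized above, written against a present-day D compiler (so it uses writeln rather than the 2008-era Phobos). The Numbers struct and its elements field are hypothetical names used only for illustration.

```d
import std.stdio;

// Hypothetical container, used only for illustration.
struct Numbers
{
    int[] elements;

    // Current-style opApply: the delegate returns an int that the
    // compiler-generated loop body uses to signal break/return/goto.
    int opApply(int delegate(ref int) dg)
    {
        foreach (ref x; elements)
        {
            int result = dg(x);
            // Forgetting this check still compiles, but break and
            // return inside the foreach body then misbehave.
            if (result)
                return result;
        }
        return 0;
    }
}

void main()
{
    auto n = Numbers([1, 2, 3, 4]);
    foreach (x; n)
    {
        if (x == 3) break;   // relies on opApply forwarding the int
        writeln(x);
    }
}
```

The user code in opApply never produces or interprets the int itself; it only has to relay it faithfully, which is the part the proposal wants to take out of the user's hands.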

In this proposal we need to add two things:

1) a new template struct and
2) a new macro [yes, this proposal relies on macros which don't exist yet!]

The template just bundles an int* (pointer to _ret) together with the 
loop body delegate:

   struct Apply(Args...)
   {
     alias void delegate(Args) LoopBody;
     LoopBody _loop_body;
     int* _ret = null;
   }

The macro is this (just guessing what the syntax will be, and hoping macros 
will support tuple-like varargs):

   macro yield(dg, args...) {
     dg._loop_body(args);
     if (dg._ret && *dg._ret) { return; }
   }

With these two library additions, opApply functions can become this:

   void opApply( Apply!(ref T) dg ) {
       for( /*T x in elements*/ ) {
           yield(dg,x);
       }
   }

Now the trickiness is *all* shifted to how you call such a beast 
properly, which is all handled by the compiler.  For a foreach in a void 
function, the compiler will have to generate code like so:

     int _ret = 0;
     void _loop_body(/*ref*/ T x)
     {
         writefln("x is ", x);
         if (x=="two") { _ret = BREAK; return; }
         if (x=="three") { _ret = RETURN; return; }
         do_something;
     }
     obj.opApply( Apply!(T)(&_loop_body, &_ret) );
     if (_ret==RETURN) return;

The language can ALMOST do this today except for three small things:
1) No macros - but they're on the way!
2) Inability to preserve ref-ness of template arguments -- but I think 
this really needs to be solved one way or another regardless.
3) The necessary changes to the foreach code gen -- these are 
straightforward.

Attached is a proof of concept demo.  I've manually inlined the yield() 
code to work around 1), and made the loop body use a non-ref type to 
work around 2).  I manually generated the foreach code too to deal with 3).
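
The following is not the attached file, just a minimal sketch of the same workarounds for a present-day D compiler: yield() is inlined by hand, the loop body takes a non-ref string, and the code foreach would generate is written out manually. The Words struct, the visit function, and the values of BREAK and RETURN are assumptions made for illustration; the post does not define them.

```d
import std.stdio;

// Assumed values; any distinct nonzero ints would do.
enum BREAK = 1;
enum RETURN = 2;

// The bundle from the proposal: loop body plus pointer to _ret.
struct Apply(Args...)
{
    alias LoopBody = void delegate(Args);
    LoopBody _loop_body;
    int* _ret = null;
}

// Hypothetical container with a "new-style" opApply.
struct Words
{
    string[] elements;

    void opApply(Apply!(string) dg)
    {
        foreach (x; elements)
        {
            // yield(dg, x), inlined by hand (workaround for 1)
            dg._loop_body(x);
            if (dg._ret && *dg._ret) { return; }
        }
    }
}

// Collects the words visited before the loop body asks to break.
string[] visit(Words obj)
{
    string[] seen;

    // Hand-written stand-in for the foreach code gen (workaround for 3)
    int _ret = 0;
    void _loop_body(string x)   // non-ref parameter (workaround for 2)
    {
        seen ~= x;
        if (x == "two") { _ret = BREAK; return; }
    }
    obj.opApply(Apply!(string)(&_loop_body, &_ret));
    if (_ret == RETURN) return seen;

    return seen;
}

void main()
{
    writeln(visit(Words(["one", "two", "three"])));
}
```

Here the loop stops after "two", so "three" is never visited, and opApply itself never sees or returns the int.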

The great thing about this proposal is that it is backwards compatible. 
foreach already generates different code depending on what the 
argument is, so this can just be another case, detected by the use of the 
Apply argument.  Code using old-style opApplys can continue to work.

The main thing still fuzzy in my mind is the status of yield and Apply. 
They don't need to be keywords per se, but the compiler at least needs 
to know about Apply so that it can recognize the signature of this 
"new-style" opApply.  I think putting them in object.d might satisfy 
all that?  If there were anonymous struct literals, Apply wouldn't even 
need to be a real struct, just an alias like we have for 'string' now.
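
As a rough library-level analogue of that recognition step, a present-day D compiler can already tell an Apply instantiation apart from an old-style delegate type with an is-expression. This is only a sketch; isApply is a hypothetical name, and the real check would live inside the compiler's foreach handling rather than in user code.

```d
struct Apply(Args...)
{
    alias LoopBody = void delegate(Args);
    LoopBody _loop_body;
    int* _ret = null;
}

// True when T is some instantiation of Apply (hypothetical helper).
enum isApply(T) = is(T == Apply!Args, Args...);

// New-style parameter types are recognized...
static assert(isApply!(Apply!(int)));
static assert(isApply!(Apply!(string, int)));
// ...while an old-style opApply delegate type is not.
static assert(!isApply!(int delegate(ref int)));

void main() {}
```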

--bb
Name: newforeach.d
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20080107/31d696d5/attachment.ksh>