Why we need opApply (Was: Can we drop static struct initializers?)

Sun Nov 22 17:12:18 PST 2009

dsimcha wrote:
> == Quote from Yigal Chripun (yigal100 at gmail.com)'s article
>> dsimcha wrote:
>>> == Quote from Max Samukha (spambox at d-coding.com)'s article
>>>> On Sat, 21 Nov 2009 18:51:40 +0000 (UTC), dsimcha <dsimcha at yahoo.com>
>>>> wrote:
>>>>> == Quote from Max Samukha (spambox at d-coding.com)'s article
>>>>>> On Fri, 20 Nov 2009 15:30:48 -0800, Walter Bright
>>>>>> <newshound1 at digitalmars.com> wrote:
>>>>>>> Yigal Chripun wrote:
>>>>>>>> what about foreach_reverse ?
>>>>>>> No love for foreach_reverse? <tear>
>>>>>> And no mercy for opApply
>>>>> opApply **must** be kept!!!!  It's how my parallel foreach loop works.  This
>>>>> would be **impossible** to implement with ranges.  If opApply is removed now,
>>>>> I will fork the language over it.
>>>> I guess it is possible:
>>>> uint[] numbers = new uint[1_000];
>>>> pool.parallel_each!((size_t i){
>>>>         numbers[i] = cast(uint) i;  // explicit cast: size_t may be wider than uint
>>>>     })(iota(0, numbers.length));
>>>> I agree it's not as cute, but it is faster, since the delegate is
>>>> called directly. Or did I miss something?
>>> I'm sorry, but I put a lot of work into getting parallel foreach working, and I
>>> also have a few other pieces of code that depend on opApply and could not (easily)
>>> be rewritten in terms of ranges.  I feel very strongly that opApply and ranges
>>> accomplish different enough goals that they should both be kept.
>>>
>>> opApply is good when you **just** want to define foreach syntax and nothing else,
>>> with maximum flexibility as to how the foreach syntax is implemented.  Ranges are
>>> good when you want to solve a superset of this problem and define iteration over
>>> your object more generally, giving up some flexibility as to how this iteration
>>> will be implemented.
>>>
>>> Furthermore, ranges don't allow for overloading based on the iteration type.  For
>>> example, you can't do this with ranges:
>>>
>>> foreach(char[] line; file) {}  // Recycles buffer.
>>> foreach(string line; file) {}  // Doesn't recycle buffer.
>>>
>>> They also don't allow iterating over more than one variable, like:
>>> foreach(var1, var2, var3; myObject) {}
>>>
>>> Contrary to popular belief, opApply doesn't even have to be slow.  Ranges can be
>>> as slow as or slower than opApply if at least one of the three functions (front,
>>> popFront, empty) is not inlined.  This actually happens in practice.  For
>>> example, based on reading disassemblies and the code in inline.c, neither front()
>>> nor popFront() in std.range.Take is ever inlined.  If the range functions are
>>> virtual, none of them will be inlined.
>>>
>>> Just as importantly, I've confirmed by reading the disassembly that LDC is capable
>>> of inlining the loop body of opApply at optimization levels >= O3.  If D becomes
>>> mainstream, D2 will eventually also be implemented on a compiler that's smart
>>> enough to do stuff like this.  To remove opApply for performance reasons would be
>>> to let the capabilities of DMD's current optimizer influence long-term design
>>> decisions.
>>>
>>> If anyone sees any harm in keeping opApply other than a slightly larger language
>>> spec, please let me know.  Despite its having been superseded by ranges for a
>>> subset of use cases (and this subset, I will acknowledge, is better handled by
>>> ranges), I actually think the flexibility it gives in terms of how foreach can be
>>> implemented makes it one of D's best features.
>> There are three types of iteration: internal to the container, external
>> by index, pointer, range, etc., and a third design with co-routines
>> (fibers) in which the container internally iterates itself and yields a
>> single item on each call.
>> Ranges accomplish only the external type of iteration. opApply allows
>> for internal iteration. All three strategies have their uses and should
>> be allowed in D.
> 
> Exactly.  I've said this before, but I think you said it much better.  Now that
> Walter has agreed to keep opApply, this should really be explained somewhere in
> TDPL and in the online docs to clarify to newcomers why someone would choose
> opApply over ranges or vice-versa.  External iteration is more flexible for the
> user of the object; internal iteration is more flexible for the designer of the
> object.

Copied this message to my todo list, thanks.

Andrei
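
For readers new to the distinction discussed above, here is a minimal sketch of the two styles side by side. It is not code from the thread: Pairs and Countdown are hypothetical types invented for illustration. Pairs uses opApply (internal iteration, here with two loop variables), while Countdown is a bare input range (external iteration).

```d
import std.stdio : writeln;

// Hypothetical container, used only for illustration.
struct Pairs
{
    int[] keys;
    string[] values;

    // Internal iteration: the container drives the loop and calls the
    // foreach body (passed in as a delegate) once per element.
    int opApply(scope int delegate(ref int, ref string) dg)
    {
        foreach (i; 0 .. keys.length)
        {
            // A nonzero result means the body left the loop early
            // (break, return, goto), so stop and propagate it.
            if (auto r = dg(keys[i], values[i]))
                return r;
        }
        return 0;
    }
}

// Hypothetical input range, used only for illustration.
// External iteration: the caller drives the loop through
// empty/front/popFront.
struct Countdown
{
    int n;
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

void main()
{
    auto p = Pairs([1, 2], ["one", "two"]);
    foreach (k, v; p)          // two loop variables, supplied by opApply
        writeln(k, " => ", v);

    foreach (x; Countdown(3))  // lowered to empty/front/popFront calls
        writeln(x);
}
```

In the opApply version the type author decides how and in what order the body is invoked, which is the freedom a parallel foreach relies on. In the range version the caller can stop, resume, or pass the range to generic algorithms.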