[Issue 4264] Support opApply in std.algorithm, std.range where possible

Sun Aug 22 07:39:49 PDT 2010

http://d.puremagic.com/issues/show_bug.cgi?id=4264

--- Comment #8 from bearophile_hugs at eml.cc 2010-08-22 07:39:44 PDT ---
Thank you for your answer David Simcha.

> Since everything you mention except the opApply issue is fixed either
> in 2.048 or in SVN,

Yes, indeed now all tests but the one that uses map() work.

I agree with more or less all your points here. And I have no final answer for
your doubts. Here are few comments.

> Even if these get fixed, supporting ref correctly would be non-trivial
> enough that I don't want to implement everything w/o ref support and
> then have to go back later and add ref support.

I see. Working and fixing the std lib is a lot of work, so I suggest to keep
Phobos as short (as in number of lines) as possible, so when bugs gets fixed
you don't need a week to update it.

A possible good solution is to wait and not support opApply until those bugs
are fixed, to reduce your work. Sometimes the typical laziness of the
programmer is positive.

> 2.  If we start building higher order iterables out of opApply, things could
> become slow as molasses through multiple layers of delegate calls.  I know LDC
> can inline these at least in simple cases, but what about when you have N > 1
> levels of them?

LDC can currently inline one level of opApply, but probably not much more than
that. Performance may indeed be reduced (even if we assume LLVM will get a bit
better with time).

On the other hand:
- D is not a functional language, D programmers don't write
map(filter(chain(map(... all the time. They usually use no more than one or two
level of lazy chaining.
- Several operations (like zip) are not possible with opApply, so this reduces
the places where they may be abused and where they can slow down code.
- In a normal program a good percentage of its code doesn't need to be
high-performance, for example you need to covert to numbers few (no more than
5) strings coming from args[] into numbers, and you use a
map!(to!(int))(args[1..$]). Many (but not all) of the usages of std.algorithm
are to shorten code, to make it more readable or less buggy, but they are not
used where max performance is necessary. This means that in practice a map()
performed on a struct that defines opApply is often not a performance problem.
(But programmers will need to be careful).

> 3.  This is a big enough change to the way std.range/std.algorithm work that I
> think it needs to be discussed first, rather than just being checked in one day
> like a bug fix or minor enhancement.  Maybe you should bring it up on the
> newsgroup.

I will write a small post about this in the main D newsgroup soon then.

I don't have a final answer for your doubts. Generally I am not for pure
solutions, I think that ranges are good, but life is complex, and if you try to
cram everything into a single abstraction, you will have problems.

If std.algorithm doesn't support opApply at all, then opApply itself becomes
much less useful. In D1 I have seen that you can implement most of
std.algorithm using opApply only.

Supporting opApply in Phobos may be a lot of work, but it's work done once.
Then all D programs are able to use this hard work. This is why I like std libs
as flexible as possible, to make them support both fixed-sized arrays and
opApply, because this later makes user code simpler and more uniform, and this
Python shows that this usage uniformity is _very_ important to make the
language (its std lib) handy and quick to use.

Another thing about ranges is more general. opApply and ranges may be used for
the same purposes, like to create a lazy generator of items, or to offer a way
to scan the leaves of a tree through a simple foreach. But if you try to write
a tree visit using opApply or using the Range protocol you see that they are
not the same thing.

You may see other examples here:
http://code.activestate.com/recipes/221457/

For example (Python 2.x code):

# Inverse Gray code
def A006068():
    yield 0
    for x in A006068():
        if x & 1:
            yield 2*x+1
            yield 2*x
        else:
            if x:
                yield 2*x
            yield 2*x+1

It's not too much hard to implement this using opApply, the code is similar to
that Python code, just much more noisy. But if you use the Range protocol you
need to manage the state manually.

With this I am trying to say that the Range protocol doesn't replace opApply,
there are algorithms more naturally implemented with opApply. If you try to
cram everything into the Range idea, you risk losing something.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------