Phobos2: iota, ranges, foreach and more

Mon May 4 13:40:56 PDT 2009

I am learning a bit more about ranges. They aren't hard to use for simple things. Now I think I understand about 10% of this topic.

------------------

Can you tell me the name of the Phobos2 function that, given a lazy argument, iterates over it and returns an eager array of all its items? (It's named array() in my libs and list() in Python.)
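To make the lazy-to-eager conversion concrete, here is how list() behaves on the Python side (Python 3 syntax). The squares() generator is a made-up example, not a function from Phobos2 or from my dlibs:

```python
def squares(n):
    """Lazy source: yields one item at a time, nothing is computed up front."""
    for i in range(n):
        yield i * i


lazy = squares(5)    # no squares have been computed yet
eager = list(lazy)   # walks the lazy source once and stores every item
print(eager)
```

After the call the lazy source is exhausted, so the eager copy is the only way to visit the items again.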

------------------

Now I think iota() with just one argument is useful. When only one argument is given, it is taken as the stop value, and the start value defaults to 0. This is handy, and it's the convention used by Python's range/xrange.
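A minimal sketch of that argument handling, written in Python 3 as a generator. This iota() is only an illustration of the convention (integer steps of 1, no step argument), not the Phobos2 implementation:

```python
def iota(first, stop=None):
    """Yield first, first+1, ... up to but excluding stop.

    With a single argument, that argument is the stop value and the
    start defaults to 0, as in Python's range().
    """
    if stop is None:
        first, stop = 0, first
    current = first
    while current < stop:
        yield current
        current += 1


print(list(iota(5)))
print(list(iota(2, 5)))
```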

------------------

While reading the Phobos2 source code, function calls written like ".front" confuse me; I find ".front()" more readable. I think I am not the only one who thinks this.

------------------

In the source code of filter in std.algorithm I have seen something like:

ref SomeName opSlice() { return this; }

Recently Andrei explained to me:

>People said, that sucks! With opApply I don't need to append the .all thingie! So I told Walter to automatically call [] against the container. So people can now write: [...] All the container has to do is define opSlice() to return its "all" range.<

But if I comment out that ref opSlice(), Filter keeps working. What is the purpose of that operator inside Filter?

(And I still have to learn about that use of ref on the return type.)

------------------

Regarding the "popFront()" name, isn't a name like "next()" (for forward iteration) enough and nicer? (But see below too).

------------------

empty(), front(), etc. are operators of structs/classes. So why not use the standard naming syntax for D operators?

popFront() => opNext()
empty() => opEmpty()
front() => opFront()
etc.

------------------

Is Phobos2 still missing the groupby() of my dlibs (and of Python's itertools)?
I think this is a very important iterator. It simplifies a lot of situations, because it captures a quite common idiom.

In a few days I may be able to write it myself for D2 too.
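For readers who don't know it, this is the Python itertools version of the idiom (Python 3 syntax). The sample data is made up for illustration. Note that groupby() only merges consecutive items with equal keys, so the input is usually sorted by the key first:

```python
from itertools import groupby

# Run-length encoding: each group is a run of equal consecutive items.
runs = [(char, len(list(group))) for char, group in groupby("aaabccdd")]
print(runs)

# Grouping by a key function: here, the first letter of each word.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_initial = [(key, list(group)) for key, group in groupby(words, key=lambda w: w[0])]
print(by_initial)
```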

------------------

D2 ranges don't yet support an index variable in foreach (or other extra values; there can be more than two!), as opApply does.

When the index is a simple integer, Python2.6 solves this simple indexing problem with the built-in "enumerate()" iterable that yields pairs (the starting point defaults to 0):

>>> s = "hello"
>>> for i, c in enumerate(s):
...   print i, c
...
0 h
1 e
2 l
3 l
4 o
>>> for i, c in enumerate(s, 10):
...   print i, c
...
10 h
11 e
12 l
13 l
14 o

I don't see enumerate() in Phobos2 yet; it may often be a cleaner solution.

To solve the more general problem in D2, we can think about having more than one opFront():

opFront(), opFront2(), opFront3(), etc, that return Tuple!() with 1, 2, 3 arguments.

And then the machinery of foreach can unpack them:

int opFront() { return this.current; }
Tuple!(int, int) opFront2() { return tuple(count, this.current); }
Tuple!(int, int, int) opFront3() { return tuple(count, 10, this.current); }

foreach(x; Iter()) => calls opFront()
foreach(x,y; Iter()) => calls opFront2()
foreach(x,y,z; Iter()) => calls opFront3()
etc.

The front-end must then guarantee that those temporary structs are optimized away, so they add no overhead.
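The dispatch idea can be emulated at run time in Python 3, just to show the intended behaviour. Everything here is hypothetical: the Countdown class, the front/front2/front3 names, and the foreach() helper are invented for the sketch, and a D front-end would of course pick the method at compile time rather than by inspecting a callback:

```python
import inspect


class Countdown:
    """Hypothetical range exposing one 'front' method per number of loop variables."""

    def __init__(self, start):
        self.current = start
        self.count = 0

    def empty(self):
        return self.current <= 0

    def pop_front(self):
        self.current -= 1
        self.count += 1

    def front(self):
        # Wrapped in a 1-tuple only so that all variants unpack the same way.
        return (self.current,)

    def front2(self):
        return (self.count, self.current)

    def front3(self):
        return (self.count, 10, self.current)


def foreach(rng, body):
    """Choose the front variant from the number of parameters of body, then iterate."""
    arity = len(inspect.signature(body).parameters)
    getter = getattr(rng, "front" if arity == 1 else f"front{arity}")
    while not rng.empty():
        body(*getter())
        rng.pop_front()


foreach(Countdown(3), lambda x: print(x))
foreach(Countdown(3), lambda i, x: print(i, x))
foreach(Countdown(3), lambda i, k, x: print(i, k, x))
```

The number of loop variables alone selects the method, which is why two variants of the same length cannot coexist in this scheme.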

You can't have two different fronts that return structs of the same length, because overloading on the return type doesn't exist, so the following (doable with opApply) may be impossible with ranges, but I think this is an acceptable limitation:

foreach(float x; Iter()) => calls one opFront()
foreach(int y; Iter()) => calls another opFront()

------------------

After thinking about it I am unable to find a better design: I now think that your splitter() has to work like the xsplitter() of my dlibs (mine is specialized for strings, which is good, because that is what you most often use it for), that is, like the str.split() method of Python strings, but lazy.

The current design of splitter in Phobos2 is not good enough (besides the fact that it splits in a non-intuitive way, but I think you have already fixed this in the next version of dmd). I'll probably ask for this again in the future if you don't listen now, because, like groupby(), splitter() is a very widely used operation.
See also rsplit().

You can find the semantics of split/rsplit here:
http://docs.python.org/library/stdtypes.html#string-methods

Playing for five minutes with the Python shell (or with the xsplit of my dlibs) may help you copy the semantics correctly (I use the word "copy" because copying those semantics, for a lazy range, is the best thing to do here).
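As a reference for those semantics, here is a minimal lazy sketch in Python 3. The name lazy_split() is made up for this example (it is not the dlibs function), and it covers only the explicit-separator case of str.split(), not the whitespace mode used when no separator is given:

```python
def lazy_split(text, sep, maxsplit=-1):
    """Lazily yield the same pieces that text.split(sep, maxsplit) returns."""
    if not sep:
        raise ValueError("empty separator")
    start = 0
    splits = 0
    # A negative maxsplit means "no limit", as in str.split().
    while maxsplit < 0 or splits < maxsplit:
        pos = text.find(sep, start)
        if pos == -1:
            break
        yield text[start:pos]
        start = pos + len(sep)
        splits += 1
    # Whatever follows the last separator found, possibly an empty string.
    yield text[start:]


print(list(lazy_split("a,,b", ",")))
print(list(lazy_split("a,b,c", ",", 1)))
```

Adjacent separators produce empty strings, and an empty input produces a single empty string, exactly as str.split() with an explicit separator does.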

------------------

I have done some simple experiments with ranges; I'll write about them in a future post.

Bye,
bearophile