A Small Contribution to Phobos

Sat Jun 1 22:58:24 PDT 2013

> For reference type ranges and input ranges which are not forward
> ranges, this will consume the range and return nothing.

I originally wrote it to accept forward ranges and use save, but 
I wanted to make it as inclusive as possible. I guess I 
overlooked the case of ref ranges. As for ranges that aren't 
forward ranges, consider a simple input range.

import std.algorithm, std.range, std.stdio;

struct InputRange
{
     int[] arr = [1, 2, 3, 4, 5];

     int front() { return arr.front; }

     bool empty() { return arr.empty; }

     void popFront() { arr.popFront(); }
}

writeln(isForwardRange!InputRange); // prints false

InputRange()
.each!(n => write(n, " "))
.map!(n => n * n)
.writeln;

This outputs 1 2 3 4 5 [1, 4, 9, 16, 25], so each is not
returning an empty range. I believe this is because r (the range
passed to each) is a value type range in this case, and the
foreach loop inside each iterates over a copy of it. This does
still leave the problem of reference type ranges.
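
To make the value type versus reference type difference concrete, here is a minimal sketch. eachSketch, ValueRange and RefRange are hypothetical names made up for illustration, and the real each may be written differently; the only assumption is that it iterates with foreach and then returns its argument.

```d
// Run with: dmd -unittest -main -run sketch.d
import std.range : isForwardRange, isInputRange;

// Hypothetical stand-in for the proposed each; the real signature may differ.
Range eachSketch(alias fun, Range)(Range r)
    if (isInputRange!Range)
{
    // foreach iterates over a copy of r. Copying a struct copies its state;
    // copying a class reference still refers to the same object.
    foreach (e; r)
        fun(e);
    return r;
}

// Value type input range (deliberately has no save).
struct ValueRange
{
    int[] arr;
    @property int front() { return arr[0]; }
    @property bool empty() { return arr.length == 0; }
    void popFront() { arr = arr[1 .. $]; }
}

// Reference type input range with the same primitives.
final class RefRange
{
    int[] arr;
    this(int[] a) { arr = a; }
    @property int front() { return arr[0]; }
    @property bool empty() { return arr.length == 0; }
    void popFront() { arr = arr[1 .. $]; }
}

unittest
{
    static assert(!isForwardRange!ValueRange);

    int sum;
    auto v = ValueRange([1, 2, 3]).eachSketch!((n) { sum += n; });
    assert(sum == 6);
    assert(!v.empty); // only the foreach copy was consumed

    auto rr = new RefRange([1, 2, 3]);
    auto r = rr.eachSketch!((n) { sum += n; });
    assert(sum == 12);
    assert(r.empty);  // the single shared object was consumed
}
```

So with this kind of implementation the struct range comes back intact, while the class range comes back empty, which is the case the reviewer pointed out.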

> Also, range-based functions should not be strict (i.e. not lazy)
> without good reason. And I don't see much reason to make this
> strict.

It's not lazy because it's intended to perform some mutating or 
otherwise side-effectful operation. Map doesn't play well with 
side effects, partially because of its laziness. A very contrived 
example:

auto arr = [1, 2, 3, 4].map!(n => n.writeln); //Now what?

It's not clear now what to do with the result. You could try a
foreach loop:

foreach (n; arr) n(); //Error: n cannot be of type void

But that doesn't work. A solution would be to modify the function 
you pass to map:

auto arr = [1, 2, 3, 4].map!((n) { n.writeln; return n; });

foreach (n; arr) {} //Prints 1 2 3 4

But that's both ugly and verbose. each also has the advantage of 
being able to return the original range (possibly modified), 
whereas map must return a MapResult due to its laziness, and you 
need that extra array call to bludgeon it into the correct form. 
each is also more efficient in that it doesn't need to return a 
copy of the data passed to it. It simply mutates it in-place.
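
The laziness point can be checked directly. This is just a small sketch using a counter (the name calls is made up for the example) instead of writeln, so the behaviour shows up in assertions rather than in output:

```d
// Run with: dmd -unittest -main -run sketch.d
import std.algorithm : map;

unittest
{
    int calls;

    // Building the map runs nothing; the function is only applied
    // when front is read.
    auto mapped = [1, 2, 3, 4].map!((n) { ++calls; return n; });
    assert(calls == 0);

    // The side effects happen only once something iterates the result.
    foreach (n; mapped) {}
    assert(calls == 4);
}
```

If nothing ever iterates the MapResult, the side effect never happens at all, which is why a strict function is the better fit for this use.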

> Also, it's almost the same thing as map. Why not just use map? The
> predicate can simply return the same value after it's operated on
> it.

See above. There are some cases where map is clunky to work with 
due to it being non-strict.

> If we did add this, I'd argue that transform is a better name, but
> I'm still inclined to think that it's not worth adding.

I chose the name each because it's a common idiom in a couple of
other languages (JavaScript, Ruby and Rust off the top of my
head), and because I think it underlines the fact that each is
meant to perform side-effectful operations.

>> exhaust iterates a range until it is exhausted. It also has the
>> nice feature that if range.front is callable, exhaust will call
>> it upon each iteration.
>> 
>> Range exhaust(Range)(Range r)
>> if (isInputRange!(Unqual!Range))
>> {
>> 
>>      while (!r.empty)
>>      {
>>          r.front();
>>          r.popFront();
>>      }
>> 
>>      return r;
>> }
>> 
>> //Writes "www.dlang.org". x is an empty MapResult range.
>> auto x = "www.dlang.org"
>>           .map!((c) { c.write; return false; })
>>           .exhaust;
>> 
>> //Prints []
>> [1, 2, 3].exhaust.writeln;
>
> The callable bit won't work. It'll just call front. You'd have to
> do something like
>
> static if(isCallable!(ElementType!R))
>     r.front()();

I was having some trouble with writing exhaust and forgot all 
about ElementType. I'll change that.
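
Something along these lines is what I have in mind for the revised version. It is only a sketch under a few assumptions: the name exhaustSketch is a placeholder, a callable element is assumed to take no arguments, and the cast(void) is my guess at the simplest way to discard the value of front explicitly when the element is not callable.

```d
// Run with: dmd -unittest -main -run sketch.d
import std.algorithm : map;
import std.range;
import std.traits : isCallable, Unqual;

// Sketch of exhaust with the ElementType suggestion applied.
Range exhaustSketch(Range)(Range r)
    if (isInputRange!(Unqual!Range))
{
    while (!r.empty)
    {
        static if (isCallable!(ElementType!Range))
        {
            auto f = r.front; // fetch the element...
            f();              // ...then call it (assumes no arguments)
        }
        else
        {
            cast(void) r.front; // evaluate front, explicitly discard the value
        }
        r.popFront();
    }
    return r;
}

unittest
{
    // A lazy map is forced: the function runs once per element.
    int calls;
    auto x = [1, 2, 3].map!((n) { ++calls; return n; }).exhaustSketch;
    assert(calls == 3);
    assert(x.empty);

    // Callable elements are invoked as the range is consumed.
    int invoked;
    auto dgs = [() { ++invoked; }, () { ++invoked; }];
    assert(dgs.exhaustSketch.empty);
    assert(invoked == 2);
}
```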

> Also, if front were pure, then calling it and doing nothing with its
> return value would result in a compilation error. The same goes if
> the element type is a pure callable.

Is this true for all pure functions? That seems like kind of 
strange behaviour to me, and doesn't really make sense given the 
definition of functional purity.

> And even if this did work exactly as you intended, I think that
> assuming that someone exhausting the range would want what front
> returns to be called is a bad idea. Maybe they do, maybe they don't;
> I'd expect that in most cases, they wouldn't. If that's what they
> want, they can call map before calling exhaust.

I think the original reason that somebody wanted exhaust was 
because map is lazy and they wanted a function which could take 
the result of map and consume it while calling front each time. 
Otherwise, there wouldn't be much reason to have this, as there 
is takeNone and popFrontN.

> So, you want to have a function which you pass something (including
> a range) and then returns that same value after calling some other
> function? Does this really buy you much over just splitting up the
> expression - you're already giving a multiline example anyway.

It gives you the advantage of not having to split your UFCS chain 
up, which I personally find valuable, and I think other people 
would as well. I think it's quite similar to the various 
side-effectful monads in Haskell, which don't do anything with 
their argument other than return it, but perform some operation 
with side-effects in the process. I'll try to think up a better 
example for this, because I think it can be quite useful in 
certain circumstances.

> And I think that this is a perfect example of something that should
> just be done with foreach anyway. Not to mention, if you're calling
> very many functions, you're going to need to use multiple lines, in
> which case chaining the functions like that doesn't buy you much.
> All you end up doing is taking what would normally be a sequence of
> statements and turning it into one multiline statement. I don't
> think that this buys us much, especially when it's just calling one
> function which does nothing on any object in the chain.

See above. I think there is quite a high value in not having to
define extra variables and split up your UFCS chain halfway
through, which is somewhat obfuscatory.

> Why do you need tap? So that you can use an anonymous function? If
> it had a name, you'd just use it with UFCS. I'd argue that this use
> case is minimal enough that you might as well just give it a name
> and then use UFCS if you really want to use UFCS, and if you want an
> anonymous function, what's the real gain of chaining it with UFCS
> anyway? It makes the expression much harder to read if you try and
> chain calls on the anonymous function.

One way in which tap can be useful is that you can perform some 
operations on data in the middle of a UFCS chain and then go 
about your business. It's probably the least useful of the four.
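
As a rough illustration of that mid-chain use, here is a sketch. tapSketch is a hypothetical stand-in written for this example, not necessarily how tap is actually defined; it assumes tap simply runs a function on the value for its side effects and passes the value along unchanged.

```d
// Run with: dmd -unittest -main -run sketch.d
import std.algorithm : filter, map;
import std.array : array;

// Hypothetical tap: call fun on the value, then return the value as is.
T tapSketch(alias fun, T)(T value)
{
    fun(value);
    return value;
}

unittest
{
    int[] seen;

    auto result = [1, 2, 3, 4]
        .filter!(n => n % 2 == 0)
        .array
        .tapSketch!((a) { seen = a.dup; }) // inspect the intermediate array
        .map!(n => n * n)
        .array;

    assert(seen == [2, 4]);
    assert(result == [4, 16]);
}
```

The chain stays in one piece and no temporary variable is needed. The same caveat as with each applies: if the tapped value is a reference type range and the function iterates it, the rest of the chain sees a consumed range.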

> UFCS' main purpose is making it so that a function can be called on
> multiple types in the same manner (particularly where it could be a
> member function in some cases and a free function in others), and it
> just so happens to make function chaining cleaner in some cases. But
> there's no reason to try and turn all function calls into UFCS
> calls, and I think that perform and tap are taking it too far.

Personally, I prefer the function-chaining style, as I think
it's more aesthetic and more amenable to Walter's notion of
component programming. For someone who doesn't use UFCS that
much, these functions will seem almost useless, as their entire
functionality can be duplicated by using some other construct. I
wrote them to allow more versatile UFCS chains, so you don't have
to break them up.

I think that people who heavily use UFCS, on the other hand, will 
find these quite useful in different situations.