Time for 2.067

Thu Feb 5 08:34:26 PST 2015

On Thursday, 5 February 2015 at 03:00:53 UTC, Andrei Alexandrescu 
wrote:
> On 2/2/15 2:42 PM, "Ulrich =?UTF-8?B?S8O8dHRsZXIi?= 
> <kuettler at gmail.com>" wrote:
>> On Friday, 30 January 2015 at 23:17:09 UTC, Andrei 
>> Alexandrescu wrote:
>>> Sorry, I thought that was in the bag. Keep current semantics, 
>>> call it
>>> chunkBy. Add the key to each group when the predicate is 
>>> unary. Make
>>> sure aggregate() works nice with chunkBy().
>>
>> I might miss some information on this, so please forgive my 
>> naive
>> question. Your requirements seem to be contradictory to me.
>>
>> 1. aggregate expects a range of ranges
>
> Probably we need to change that because aggregate should 
> integrate seamlessly with chunkBy.
>
>> 2. you ask chunkBy to return something that is not a range of 
>> ranges
>
> Yah.
>
>> 3. you ask chunkBy to play along nicely with aggregate
>
> Yah.
>
>> There are certainly ways to make this work. Adding a special 
>> version of
>> aggregate comes to mind. However, I fail to see the rational 
>> behind this.
>
> Rationale as discussed is that the key value for each group is 
> useful information. Returning a range of ranges would waste 
> that information forcing e.g. its recomputation.
>

I understand and agree. My suggestion aims to avoid this 
particular waste. See below.

>> To me the beauty of range is the composibility of "simple" 
>> constructs to
>> create complex behavior. The current chunkBy does not need to 
>> be changed
>> to "add the key to each group when the predicate is unary":
>>
>>  r.map!(pred, "a")
>>   .chunkBy!("a[0]")
>>   .map!(inner => tuple(inner.front[0], inner.map!"a[1]"));
>>
>> So I'd like to know why the above is inferior to a rework of 
>> the
>> chunkBy's implementation. Maybe this is a question for D.learn.
>
> Wouldn't that force recomputation if a more complex expression 
> replaced a[0]?

I do not think you ever want to replace a[0] here. In the code 
above the (original) predicate to chunkBy is pred. The idea is to 
evaluate the predicate outside of chunkBy. Create a range of 
tuples from the original range, chunk the range of tuples and 
construct the desired result from the chunked range of tuples.

// create a range of `tuple(pred(a), a)`
r.map!(pred, "a")

// chunk the range of tuples based of the first tuple element
// this results in a range of ranges of tuples
    .chunkBy!("a[0]")

// convert the inner ranges of tuples to a tuple of the predicate 
applied and the appropriate range
    .map!(inner => tuple(inner.front[0], inner.map!"a[1]"));

The construction of a range of tuples is not for free. On the 
bright side:

* you only do it when you need it
* if your predicate is that heavy, you might want to precompute 
it anyway
* a modified chunkBy is not exactly free either (and you pay the 
price even if you do not need the key value)

Now I learned that map is very lazy and applies the function 
inside front(). Thus, the above might actually result in multiple 
evaluations of the predicate. Luckily, there is the new cache 
function:

auto chunkByStar(alias pred, Range)(Range r)
{
return r.map!(pred, "a")
    .cache
    .chunkBy!("a[0]")
    .map!(inner => tuple(inner.front[0], inner.map!"a[1]"));
}

My point here is, we can construct a version of chunkBy that does 
not waste the key value with modest means. With great power comes 
great flexibility. I wanted to sneak this in as an example, 
because it is not clear what eventual users might actually need.

On the other hand there is no limit to the special cases we could 
add. aggregate might not be the only function to work with 
chunkBy. And even an aggregate function that takes a tuple of a 
range and something else and only uses the range seems wrong to 
me, given expressive the power D has. The transformation of the 
range is just on map away:

chunkByStar!(...)(r).map!"a[1]".aggregate!max

Then again, I might be missing something huge here.