groupBy/chunkBy redux

Peter Alexander via Digitalmars-d digitalmars-d at puremagic.com
Fri Feb 13 15:45:47 PST 2015


On Friday, 13 February 2015 at 18:32:35 UTC, Andrei Alexandrescu 
wrote:
> * Perhaps rename groupBy to chunkBy. People coming from SQL and 
> other languages might expect groupBy to do hash-based grouping.

Agreed.


> * The unary function implementation must return for each group 
> a tuple consisting of the key and the lazy range of values. The 
> binary function implementation should continue to only return 
> the lazy range of values.

Is the purpose of this just to avoid the user potentially needing 
to evaluate the key function twice?
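
A minimal sketch of that difference, assuming a Phobos release in 
which unary chunkBy yields tuple(key, sub-range) as proposed above. 
keyOf, keysViaBinary and keysViaUnary are hypothetical names used 
only for this illustration:

```d
import std.algorithm.iteration : chunkBy;
import std.typecons : Tuple, tuple;

alias Row = Tuple!(string, int);

// Hypothetical key function, named only for this illustration.
string keyOf(Row x) { return x[0]; }

// Binary predicate: each chunk is just the lazy sub-range, so the
// caller has to evaluate the key function again to learn the key.
string[] keysViaBinary(Row[] data)
{
    string[] keys;
    foreach (chunk; data.chunkBy!((a, b) => keyOf(a) == keyOf(b)))
        keys ~= keyOf(chunk.front);
    return keys;
}

// Unary predicate (assumed to yield tuple(key, sub-range)): the key
// arrives together with the chunk.
string[] keysViaUnary(Row[] data)
{
    string[] keys;
    foreach (chunk; data.chunkBy!(x => keyOf(x)))
        keys ~= chunk[0];
    return keys;
}

void main()
{
    auto data = [tuple("John", 100), tuple("John", 35), tuple("Jane", 200)];
    assert(keysViaBinary(data) == ["John", "Jane"]);
    assert(keysViaUnary(data) == ["John", "Jane"]);
}
```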


> * SortedRange should add a method called group(). Invoked with 
> no predicate, group() should do what chunkBy does, using the 
> sorting predicate.

This will need to be called something else. There may be existing 
code that calls std.algorithm.group on a SortedRange using UFCS, 
and a new member named group() would take precedence over the free 
function and silently change that code's behaviour.
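
The clash can be sketched with two hypothetical wrapper types (Plain 
and WithMember are stand-ins for this illustration, not SortedRange 
itself). D only falls back to UFCS when the type has no member of 
that name, so adding a member changes what r.group() means:

```d
import std.algorithm.iteration : group;
import std.range : walkLength;

// Hypothetical input range without a group() member.
struct Plain
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// The same range plus a hypothetical group() member.
struct WithMember
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    string group() { return "member"; }
}

void main()
{
    auto a = Plain([1, 1, 2]);
    // No member named group: UFCS resolves to std.algorithm's group,
    // which yields one entry per run of equal elements.
    assert(a.group().walkLength == 2);

    auto b = WithMember([1, 1, 2]);
    // The member wins, so the identical call now does something else.
    static assert(is(typeof(b.group()) == string));
    assert(b.group() == "member");
}
```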


> * aggregate() should detect the two kinds of results per group 
> (well, chunk) and process them accordingly: for unary-predicate 
> chunks, pass the key through and only process the lazy range. 
> Meaning:
>
> auto data = [
>   tuple("John", 100),
>   tuple("John", 35),
>   tuple("Jane", 200),
>   tuple("Jane", 87),
> ];
> auto r = data.chunkBy!(x => x[0]).aggregate!sum;
>
> yields a range of tuples: tuple("John", 135), tuple("Jane", 
> 287).

Not sure I understand how this is meant to work.

With your second bullet implemented, data.chunkBy!(x => x[0]) 
will return:

tuple("John", [tuple("John", 100), tuple("John", 35)]),
tuple("Jane", [tuple("Jane", 200), tuple("Jane", 87)])

(here [...] denotes the sub-range, not an array).

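So aggregate will pass the key part through untouched, but how 
does it know to ignore the name inside each sub-range element? 
Each element is still tuple(name, amount), so sum would be applied 
to whole tuples rather than to the amounts.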
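
One way to get per-name totals is to project out the amount 
explicitly before summing. A sketch, assuming unary chunkBy returns 
tuple(key, sub-range) as in the second bullet. It uses map and sum 
rather than the proposed aggregate, which is not assumed to exist, 
and totalsByName is a hypothetical name:

```d
import std.algorithm.iteration : chunkBy, map, sum;
import std.array : array;
import std.typecons : Tuple, tuple;

alias Row = Tuple!(string, int);

// Hypothetical helper: sums the amounts for each run of equal names.
auto totalsByName(Row[] data)
{
    return data
        .chunkBy!(x => x[0])
        // g[0] is the key, g[1] the lazy sub-range of whole rows;
        // the inner map drops the name so that sum only sees amounts.
        .map!(g => tuple(g[0], g[1].map!(x => x[1]).sum))
        .array;
}

void main()
{
    auto data = [
        tuple("John", 100),
        tuple("John", 35),
        tuple("Jane", 200),
        tuple("Jane", 87),
    ];
    // 100 + 35 = 135 and 200 + 87 = 287
    assert(totalsByName(data) == [tuple("John", 135), tuple("Jane", 287)]);
}
```

Without the inner map, sum would be handed the tuples themselves, 
which is the problem described above.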

