[WORK] groupBy is in! Next: aggregate

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Fri Jan 23 14:17:59 PST 2015


On Fri, Jan 23, 2015 at 09:56:11PM +0000, MattCoder via Digitalmars-d wrote:
> On Friday, 23 January 2015 at 18:08:30 UTC, Andrei Alexandrescu wrote:
> >So H.S. Teoh awesomely took
> >https://github.com/D-Programming-Language/phobos/pull/2878 to
> >completion.  We now have a working and fast relational "group by"
> >facility.
> >
> >See it at work!
> >
> >----
> >#!/usr/bin/rdmd
> >
> >void main()
> >{
> >    import std.algorithm, std.stdio;
> >    [293, 453, 600, 929, 339, 812, 222, 680, 529, 768]
> >        .groupBy!(a => a & 1)
> >        .writeln;
> >}
> >----
> >
> >[[293, 453], [600], [929, 339], [812, 222, 680], [529], [768]]
> 
> Sorry if this a dumb question, but since you're grouping an array
> according some rule, this shouldn't be:
> 
> [293, 453, 929, 339, 529][600, 812, 222, 680, 768]
> 
> ?
> 
> Because then you have the array of "trues" and "falses" according the
> condition (a & 1).
[...]

It's kind of a misnomer, because it actually only considers *consecutive
runs* of equivalent elements; it doesn't look at the whole range before
deciding what goes in which group. So technically it should be called
consecutiveGroupBy or consecutivePartitionBy, but that's too big a
mouthful to be a usable name.

What you describe could be an interesting candidate to add, though. It
could iterate over distinct values of the predicate, and traverse the
forward range (input ranges obviously can't work unless you allocate,
which makes it no longer lazy) each time. This, however, has O(n*k)
complexity where k is the number of distinct predicate values. If it's
anything bigger than bool or a relatively small enum, it would be
impractical. Moreover, the requirement to traverse the range multiple
times kinda sux; you might as well just call filter() with different
expected values yourself, in a loop. In fact, you could ostensibly
implement it something along these lines:

	auto globalPartition(alias pred, R)(R range) {
		alias Values = typeof(pred(range.front, range.front));

		return iota(Values.min, Values.max)
			.map!(v => range.save.filter!(pred(e) == v));
	}


T

-- 
Who told you to swim in Crocodile Lake without life insurance??


More information about the Digitalmars-d mailing list