nextPermutation and ranges

Thu Feb 7 12:04:06 PST 2013

On Thu, Feb 07, 2013 at 08:40:25PM +0100, Peter Alexander wrote:
[...]
> >1) To avoid excessive allocations, it would have to be a transient
> >range. Which means it may have unexpected results if you use it with
> >an algorithm that doesn't handle transient ranges correctly.
> 
> This has been discussed previously. bearophile suggested a policy to
> control whether the buffer is permuted in-place, with it defaulting
> to creating duplicates. The slow down from duplicates on DMD was
> ~3x.

Hmm. There's also the problem that there is no generic way to duplicate
a range.  Only arrays are guaranteed to support .dup and .idup. So you'd
have to allocate an array to store each permutation, and return that
instead of the original range. This could be a major problem (one might
expect to get elements of the same type as the original range, instead
of arrays!).

> >2) If the starting range is not sorted, then the permutation range
> >needs to make a copy of the original range so that it knows when all
> >permutations have been enumerated.
> 
> I sort on range creation (if not already sorted).

Good idea! But nevertheless the original range will be modified, so we
either have to live with that, or we still need to make a copy somehow.

> >If transience is acceptable, and the starting range is always sorted,
> >then it's almost trivial to write a wrapper range:
> >
> >	auto permutations(R)(R forwardRange)
> >		if (isForwardRange!R)
> >	{
> >		struct Permutations
> >		{
> >			R front;
> >			bool empty = false;
> >			this(R _src) { front = _src; }
> >			void popFront() { empty = !nextPermutation(front); }
> >		}
> >		return Permutations(forwardRange);
> >	}
> 
> This is missing some features:
> 
> - Bidirectionality (this complicates things).

Well, functionality beyond input ranges can, of course, be added on top.
Bidirectionality is trivial, actually: you just reverse the predicate to
nextPermutation:

	void popBack() { empty = !nextPermutation!"b < a"(front); }

> - Length (this complicates things, because it easily overflows).

Yeah, this one suffers from the same problem as using factorial.
Permutation ranges grow exponentially (O(n^n)). This means if length is
64-bit ulong, you will overflow once the input range is longer than 20
elements (21! > 2^64), which is a laughably small upper limit for
generic code.

> - Random-access (non-trivial, but useful)
> 
> A library solution should address all these.

Random-access is certainly non-trivial. In the case of the input having
all unique elements, indexing is not *too* hard (it's just the same as
using factorial to map to the permutations). But if you're going to
support non-unique elements, then you'll need to invent some other
scheme for mapping range elements to indices. (Are there research papers
on how to do this? I assume there's some kind of pattern to it... but it
may be non-trivial to implement!)

T

-- 
What do you get if you drop a piano down a mineshaft? A flat minor.