[OT] Algorithm question

Mon May 1 10:02:17 PDT 2017

On Mon, May 01, 2017 at 08:42:05AM +0000, John Colvin via Digitalmars-d wrote:
[...]
> You can recover O(n) calls to P by caching the results.

Sorry, I forgot to specify that P(x) can be different each time.

Also, the input A may also be different each time, albeit remain the
same length (though the expectation is that it will be mostly the same
-- not that it matters, though, I don't think any of the proposed
solutions here count on that).

> You can recover O(n) for both P and rng by progressively removing
> elements from r or any other method that results in an easily
> samplable set of all elements of r not yet visited (and therefore
> possible candidates).
[...]

Yes, I realize that if r was mutable, there are existing simple
solutions. The problem is that it's not mutable, or rather, should not
be mutated by this selection algorithm.

[...]
> Yes. As a matter of fact it would be the logical extension of my
> algorithm above for when you can't or don't want to change r (again,
> only just thought of this, untested...):
> 
> auto r = ... /* elements of A */
> auto indices = iota(r.length).array;
> auto nRemaining = r.length;
> while (nRemaining) {
> 	auto i = uniform(0, nRemaining);
> 	if (P(r[indices[i]])) return r[indices[i]];
> 	else swap(indices[i], indices[--nRemaining]);
> }
> 
> You could also do the same thing but with an array of pointers to
> elements of r instead of indices.

The trouble with this is that you have to allocate an array of indices,
and r.length is rather large, so creating this array is O(n) each time.
But I think Ivan Kazmenko has an excellent idea to cache this index
array. Since, by symmetry, it doesn't matter if the starting indices
array was the identity permutation, we could just initialize this array
once, and the next time just skip to the loop directly (reusing the
previous (possibly partially) permuted indices as the starting point).
It does require O(n) memory, which is unfortunate, but at least we only
pay for that once.

I was hoping for something more along the lines of a random permutation
algorithm with sublinear (if not O(1)) space complexity that can return
elements of the permutation consecutively on-demand, like a lazy range.
Something along the lines of:

	auto seed = ... /* get random number */
	auto lazyRange = generatePermutation(seed, n);

where generatePermutation is some incremental algorithm that produces a
unique permutation of 0 .. n per seed value. Ideally, this algorithm
would only do as much work as is necessary to produce the first k
elements of the permutation, so that if the main loop terminates early
(we found an element satisfying P(x)) we don't waste time generating the
rest of the permutation.

T

-- 
"No, John.  I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev