[OT] Algorithm question

Mon May 1 03:22:19 PDT 2017

On Monday, 1 May 2017 at 09:01:33 UTC, Ola Fosheim Grøstad wrote:
> array of indexes and mutate that instead, still O(N). If you 
> cache in a heap you get O(N log N).

To avoid confusion: I didn't mean a literal "heap", but some kind 
of search structure that tracks and draws from unused indice 
ranges in a way that does not require O(N) initialization and can 
be implemented as a single array on the stack. I don't remember 
what it is called, but you draw on each node based on the number 
of children (kind of like the ones used in markov-chain type 
situations). I assume this can be maintained in O(log N) per 
iteration with O(1) initialization.

E.g.

1. push_indices(0,N-1) //data (0,N-1)
2. i = draw_and_remove_random_index() // data could now be 
(0,i-1), (i+1,N-1)
3. if( i == -1 ) return
4. dostuff(i)
5. goto 2

Even though it is O(N log N) it could outperform other solutions 
if you only want to draw a small number of elements. And a small 
array with a linear search (tracking used indices) would 
outperform that if you only want to draw say 8 elements. If you 
want to draw close to N elements then you would be better off 
just randomizing a full array (with elements from 0 to N-1). Or 
you could do all of these, start with linear search, then convert 
to a tree like search, then convert to full enumeration.

In general I don't think asymptotical analysis will do much good. 
I think you need to look on actual costs for typical N, P and 
different distributions of satisfied P, e.g.

cost of loop iteration: 10
average cost of P(x)=>false: 1
average cost of P(x)=>true: 1000
probability for P(a[i]) for each i