random cover of a range
Steven Schveighoffer
schveiguy at yahoo.com
Thu Feb 12 12:01:01 PST 2009
"Andrei Alexandrescu" wrote
> Ok, following Leonardo's and Denis' ideas on covering an immutable array
> in random order, I thought about it and came up with one algorithm, in
> addition to the obvious one that stores one integer per element (see
> Denis' code).
>
> Here's the algorithm. It stores one bit per element and takes O(n log n)
> time to cover a range of length n.
>
> 1. Given a random-access range r of length n, create a bitmap with one bit
> per element in r. The range is initially all zeros.
>
> 2. At each step, select a random index in the _original_ array, i.e. a
> random number in 0 .. r.length. Then walk upwards (with wrapping at the
> end of the range) until you find a bit = 0 in the bitmap. Mark that bit =
> 1 and return the found element.
>
> 3. The cover ends when all bits in the bitmap are 1.
>
> I did a ballpark complexity estimate by approximating the average number
> of steps in (2) with n / (n - k), where k is the number of 1s in the
> bitmap. IIRC the average number of steps to hit a 1 in a binomial
> distribution with p = (n - k) / n is 1/p. Right? In that case, the total
> number of steps taken by the algo above is:
>
> N = n / 1 + n / 2 + ... + n / (n - 1) = n * (1 + 1/2 + 1/3 + ... + 1/(n -
> 1)) < O (n log n)
>
> Makes sense?
OK, but you still have O(n) space requirement (albeit with a much smaller
constant).
There are some other optimizations you can make. For instance, finding a
bit can be optimized somewhat (i.e. if one integer's worth of bits is
0xffffffff, you can skip 32 bits, and when you find an int which contains
some 0 bits, you can even use a lookup table for 8 bits to tell you what the
next bit should be, then you can go 8 bits at a time looking for your empty
bit). This might reduce the "search time" to a more acceptable number. At
most, the search time for a bit will be n / 32 + 3.
I think the tradeoff might be significant between a larger space requirement
and an O(n) vs. O(nlgn) algo, but it would be highly dependent on the
dataset size.
There's one other "obvious" algorithm.. store pointers to the original
elements, or null if the element hasn't been swapped. The code would look
something like:
T [] _origArray;
T *[] _tmparray;
T *_headElem;
private T *getElem(int idx)
{
if(auto retval = _tmparray[idx])
return retval;
return &_origArray[idx];
}
void next()
{
int idx = randomInt(0.._tmparray.length);
_headElem = getElem(idx);
_tmparray[idx] = getElem(_tmparray.length - 1);
_tmparray.length = _tmparray.length - 1;
}
Same space requirements as the index generating range, but it may be faster
in terms of looking up values.
-Steve
More information about the Digitalmars-d
mailing list