random cover of a range

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Fri Feb 13 16:45:21 PST 2009


Steven Schveighoffer wrote:
> "Steven Schveighoffer" wrote
>> "Andrei Alexandrescu" wrote
>>> Andrei Alexandrescu wrote:
>>>> Andrei Alexandrescu wrote:
>>>>> Given an array of length m, return a range that iterates the array in 
>>>>> random order such that the entire array is visited without going 
>>>>> through the same element more than once. Consume minimal amounts of 
>>>>> memory and time. Baseline solution: allocate an array of indices, 
>>>>> shuffle it, then follow indices stored in the array.
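
For concreteness, a minimal D sketch of that baseline (the sample array
and names are purely illustrative):

import std.array : array;
import std.random : randomShuffle;
import std.range : iota;
import std.stdio : writeln;

void main()
{
    auto a = [10, 20, 30, 40, 50];

    // Allocate an array of indices 0 .. a.length and shuffle it.
    auto indices = iota(a.length).array;
    randomShuffle(indices);

    // Follow the shuffled indices: each element is visited exactly once,
    // in random order.
    foreach (i; indices)
        writeln(a[i]);
}
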
>>>> Ok, here's something that should work.
>>>>
>>>> Start with an array a of length a.length and allocate a bitmap with
>>>> one bit per element of the array. Also keep a count of the unselected
>>>> elements left in the array; call it k.
>>>>
>>>> To implement next():
>>>>
>>>> 1. Walk from the beginning of the array to the first unselected element.
>>>>
>>>> 2. Roll a fair die with k faces. If the die lands on one designated
>>>> face, choose that first unselected element. Otherwise continue.
>>>>
>>>> 3. Continue walking until the next unselected element.
>>>>
>>>> 4. Roll a fair die with k-1 faces. If the die lands on the designated
>>>> face, choose that unselected element. Otherwise repeat from step 3
>>>> with dice of k-2, k-3, ..., 1 faces.
>>>>
>>>> This has O(n log n) complexity. There is one obvious optimization: 
>>>> eliminate the already-selected elements at the front so we don't need 
>>>> to walk over them at each iteration. It's unclear to me how this 
>>>> affects complexity.
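
Roughly, that scheme could be sketched in D along the following lines
(the struct name and a bool[] in place of a packed bitmap are just
illustration, not a definitive implementation):

import std.random : uniform;

struct RandomCover(T)
{
    T[] items;    // the underlying array
    bool[] taken; // one flag per element (the "bitmap")
    size_t k;     // number of unselected elements left

    this(T[] a)
    {
        items = a;
        taken = new bool[a.length];
        k = a.length;
    }

    bool empty() const { return k == 0; }

    // Steps 1-4 above: walk the unselected slots, rolling a die with
    // k, k-1, ... faces; each unselected slot ends up chosen with
    // probability exactly 1/k.
    T next()
    {
        size_t faces = k;
        size_t chosen;
        foreach (i, t; taken)
        {
            if (t) continue;            // skip already-selected slots
            if (uniform(0, faces) == 0) // die with `faces` faces
            {
                chosen = i;
                break;
            }
            --faces;                    // one fewer candidate remains ahead
        }
        taken[chosen] = true;
        --k;
        return items[chosen];
    }
}

Usage would be along the lines of

    auto cover = RandomCover!int([1, 2, 3, 4, 5]);
    while (!cover.empty) writeln(cover.next());

with writeln from std.stdio.
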
>>
>> This significantly increases the runtime by executing the random number 
>> algorithm up to k-1 times per element.
>>
>> How is this different from just doing a linear search through the 
>> sparse array for the random(0..k)th unused element?
>>
>> But I do concur that it provides a uniform map.  However, wouldn't the 
>> worst-case runtime be O(n^2)?  I.e. k + (k-1) + (k-2) + ... + 1 
>> iterations == k*(k+1)/2?  I'm not sure the average runtime is going to 
>> be O(n lg n).
> 
> Average runtime is still going to be O(n^2), because the average element 
> chosen should be about k/2 elements into the remaining range.
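
Spelling that out: summing k/2 over k = n, n-1, ..., 1 gives roughly

    n/2 + (n-1)/2 + ... + 1/2 = n(n+1)/4

dice rolls over the whole run, which is indeed O(n^2) on average.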

So at some point there are k slots up for grabs out of n. The number of
steps taken per pass is composed of the steps spent skipping over
already-taken slots, plus the steps spent on slots the die tells me to
pass over even though they could be taken. The first number is
proportional to the number of taken slots, i.e. n-k; the second is
proportional to the number of remaining options, i.e. k. Summing the two
does indeed give O(n) per pass.

Now let us consider the following improvement: whenever the element at
the front of the array is taken, we also discard every element at the
front that was taken earlier. That way there is no need to skip over the
already-occupied slots at the start of each pass; in other words, it
becomes an invariant that the first slot we look at is not taken. How
does this affect complexity?
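
For concreteness, that variant might look roughly like this (a sketch
only; compared to the earlier outline, only next() changes, plus a
front cursor):

import std.random : uniform;

struct RandomCoverTrimmed(T)
{
    T[] items;
    bool[] taken;
    size_t front; // index of the first unselected slot (the invariant)
    size_t k;     // number of unselected elements left

    this(T[] a)
    {
        items = a;
        taken = new bool[a.length];
        k = a.length;
    }

    bool empty() const { return k == 0; }

    T next()
    {
        size_t faces = k;
        size_t i = front; // invariant: items[front] is not taken
        // Skip taken slots; on each unselected slot roll a die with
        // faces, faces-1, ... sides until one of them wins.
        while (taken[i] || uniform(0, faces--) != 0)
            ++i;
        taken[i] = true;
        --k;
        auto result = items[i];
        // Trim the front: advance past every leading taken slot so the
        // next call starts at an unselected element.
        while (front < items.length && taken[front])
            ++front;
        return result;
    }
}

The walk now only pays for taken slots that sit between unselected
ones, which is exactly what the complexity question above is asking
about.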


Andrei


