[OT] Algorithm question

Thu May 4 01:04:22 PDT 2017

On 03.05.2017 01:09, MysticZach wrote:
>
>
>> Counterexample: [1,1,1,0,0]
>>
>> Your algorithm returns 0 with probability 7/20, 1 with probability
>> 6/20 and 2 with probability 7/20.
>>
>> Note that there is a simple reason why your algorithm cannot work for
>> this case: it picks one of 20 numbers at random. The resulting
>> probability mass cannot be evenly distributed among the three
>> elements, because 20 is not divisible by 3.
>
> That's true. Two points, though: If the range of error is within
> 1/(n*(n-1)), with array length n,

It's not though. For example, [1,1,1,0,...,0] (length 29), you get 0 and 
2 each with probability 43/116, but 1 only with probability 30/116.

It might be interesting to figure out how far from uniform the 
distribution can get.

> it may be close enough for a given
> real world use case. 2. n is known to be large, which makes 1/(n*(n-1))
> even smaller.  You'd probably have to trade accuracy for performance. I
> wonder how much less performant a truly accurate algorithm would be.
> Also, whether there's a way to make an algorithm of this style
> completely accurate.

For an accurate algorithm based on (unconditional) uniformly random 
sampling, the number of outcomes from sampling needs to be divisible by 
all numbers from 1 to n. (Because each of those could conceivably be the 
number of elements satisfying the predicate.)