randomShuffle

Joseph Rushton Wakeling joseph.wakeling at webdrake.net
Mon Jun 3 10:35:11 PDT 2013


On 06/03/2013 07:00 PM, Diggory wrote:
> Thanks for testing before dismissing completely :P The way it returns results
> can be improved a lot by pre-allocating a range of the necessary size/using a
> range passed in.

Surely. :-)

> The complexity is O(N²), where N is the number of samples out of M. The
> algorithm is something I just thought of, although I've possibly used it
> before, and I'm sure someone else has thought of it too. It's quite simple:
> it effectively says "pick a value from the range except for this value"
> recursively, but instead of dividing the range in two, it shifts the top part
> down first to make a contiguous range and then shifts the results which are
> past the split up one afterward.

It's an interesting little piece of invention.  I'll have to look around and see
if there's anything similar documented.  I've not seen anything in the
literature which has this particular property of O(N) random variates plus
O(N^2) calculation.

The fact that it's not sequential might have been a bit of an issue
historically, which maybe explains why I haven't come across something like it.
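
For concreteness, here's a rough sketch of one way to get that trade-off --
one uniform variate per sample, O(N^2) shifting work.  It's written from your
description rather than your code (the sorted helper array is my own
addition), so the details may well differ from what you actually do:

import std.random : Random, uniform;

// Sketch only: draw n distinct values from 0 .. m using one random
// variate per sample and O(n^2) adjustment work.
size_t[] sampleDistinct(size_t n, size_t m, ref Random rng)
{
    assert(n <= m);
    size_t[] result;   // picks in the order they were drawn
    size_t[] sorted;   // the same picks kept in ascending order
    result.reserve(n);
    sorted.reserve(n);

    foreach (i; 0 .. n)
    {
        // Draw from a contiguous range covering the m - i values that
        // are still unpicked ...
        auto r = uniform(size_t(0), m - i, rng);

        // ... then walk the earlier picks in ascending order, shifting
        // r up by one for each pick at or below it; this maps r back
        // into the original 0 .. m range while skipping taken values.
        size_t pos = 0;
        while (pos < sorted.length && sorted[pos] <= r)
        {
            ++r;
            ++pos;
        }

        result ~= r;
        sorted = sorted[0 .. pos] ~ r ~ sorted[pos .. $];
    }
    return result;
}

Each of the n draws costs one variate plus a scan over at most n earlier
picks, hence O(N) variates and O(N^2) work overall.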

> I expect the Phobos implementation to be something like O(M), in which case my
> method would be faster whenever N < sqrt(M) (with the optimisations).

No, the Phobos implementation is O(N) -- I wrote it :-)  See:
http://braingam.es/2012/07/sampling-d/

The API is Andrei's -- the original implementation used what Knuth refers to as
Algorithm S [which is really simple and easy to implement and check, but is
indeed O(M)]; I updated it to use Algorithm D (which is complicated :-).
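
For reference, calling the Phobos version looks like this (the printed values
are just illustrative):

import std.array : array;
import std.random : Random, randomSample, unpredictableSeed;
import std.range : iota;
import std.stdio : writeln;

void main()
{
    auto rng = Random(unpredictableSeed);
    // Lazily draws 5 distinct elements from 0 .. 100, emitted in
    // increasing order, in O(n) time with respect to the sample size.
    auto sample = randomSample(iota(100), 5, rng);
    writeln(sample.array);   // e.g. [3, 27, 41, 68, 90]
}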

So, since randomSample takes kN time where k is constant, your algorithm will
probably win where N < k.

I found that the sampling times were about equal at N = 50 with your algorithm
as it's currently written.
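
For anyone who wants to try the comparison themselves, a rough timing harness
might look like the following (the population size and iteration count are
arbitrary placeholders, and sampleDistinct is the sketch from earlier in this
message, standing in for the real routine):

import std.array : array;
import std.datetime.stopwatch : benchmark;
import std.random : Random, randomSample, unpredictableSeed;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    enum m = 1_000_000;   // arbitrary population size
    enum n = 50;          // near the break-even point observed above
    auto rng = Random(unpredictableSeed);

    // Phobos randomSample (Algorithm D), O(n) in the sample size.
    void viaPhobos() { auto s = randomSample(iota(m), n, rng).array; }

    // The O(n^2) sketch from earlier in this message.
    void viaSketch() { auto s = sampleDistinct(n, m, rng); }

    auto times = benchmark!(viaPhobos, viaSketch)(1000);
    writefln("randomSample: %s   sketch: %s", times[0], times[1]);
}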

