[OT] shuffling lines in a stream
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Fri Oct 10 13:54:18 PDT 2008
Sean Kelly wrote:
> It would be slower than the seeking option, but something like a
> randomized mergesort would work as well. If the program can buffer k
> lines in a file containing n lines, then read the first k lines into
> memory, shuffle them, and write them out to a temporary file. Repeat
> until the input is exhausted. Now randomly pick two of the temporary
> files and randomly merge them. Repeat until two temporary files remain,
> then output the result of the final random merge to the screen.
>
> For small files (ie. where n<k) the file would be read, shuffled in
> memory, and printed to the screen, assuming the proper checks were in
> place.
I think I found a problem with your solution. Consider you break the
file in three chunks: a, b, c. Then you pick at random b and c and
randomly merge them. The problem is, you make early the decision that
nothing in a will appear in the first two thirds of the result. So the
quality of randomness suffers. How would you address that?
Andrei
More information about the Digitalmars-d
mailing list