shuffling lines in a stream

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Thu Oct 9 22:14:47 PDT 2008


BCS wrote:
> Reply to Andrei,
> 
>> The high traffic enjoyed by "random k-sample of a file" confirms that
>> plenty of people around here have idle cycles ready to be spent on fun
>> tasks.
>>
>> In that vein, here's another problem. Say you want to randomly shuffle
>> lines in a stream. You don't know the size beforehand, and one pass is
>> preferable. The stream can be larger than the working memory, but not
>> an unbounded amount of times larger. Also lines in the stream can be
>> unbounded, but it is guaranteed that at least a nonzero fraction of
>> the file fits in working memory.
>>
>> You can use temporary files, but of course minimizing their total size
>> is recommended. Speed is of course important. Draft your solution and
>> explain why you think it's correct and advantageous.
>>
>> In case you hate random stuff, just wait; the next problem won't have
>> any random element in it :o).
>>
>> Andrei
>>
> 
> Um... I'm not following that.

Very simple: given a file, just randomly change the order of its lines 
and output them. For example (assume read from standard input):

$ ./shuffle
a
b
c
^D
b
a
c
$ ./shuffle
a
b
c
^D
b
c
a
$ _


Andrei



More information about the Digitalmars-d mailing list