shuffling lines in a stream
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Fri Oct 10 14:23:27 PDT 2008
BCS wrote:
> Reply to Andrei,
>
>> BCS wrote:
>>
>
>>> I don't think there is any way to avoid storing the whole file
>>> because for a uniform sort there is a possibility that the last line
>>> will come out first.
>>>
>> I agree with the last paragraph, but lseeking seems overly
>> inefficient. Could you avoid that?
>>
>> Andrei
>>
>
> algorithmically, I don't think the lseek will matter,
I think it does. Essentially, you either impose random access on the
input or copy it to a medium that offers random access. Consider:

    gunzip --stdout bigfile.gz | shuffle

Here the input is a pipe, so you are forced to store a copy of it.
Besides, random access is kind of a dicey proposition on large files.
Of course, only measurement will show...
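
To make the buffering point concrete, here is a minimal sketch in Python (not code from this thread) of a shuffle that reads its input strictly once, front to back, and never seeks. It uses the "inside-out" variant of the Fisher-Yates shuffle; the function name shuffle_stream and the optional rng parameter are hypothetical names chosen for illustration. Note that the buffer still grows to hold every line, which is the storage cost discussed above.

```python
import random


def shuffle_stream(lines, rng=None):
    """Return the items of a one-pass iterable in uniformly random order.

    Illustrative sketch: the input is consumed sequentially (no seeking),
    but every item is kept in memory until the end.
    """
    if rng is None:
        rng = random.Random()
    buf = []  # grows to hold the entire input
    for i, line in enumerate(lines):
        # Pick a slot in [0, i], inclusive on both ends.
        j = rng.randint(0, i)
        if j == i:
            # The new item takes the newly created last slot.
            buf.append(line)
        else:
            # Move the current occupant of slot j to the end,
            # then place the new item in slot j.
            buf.append(buf[j])
            buf[j] = line
    return buf
```

As a filter on a pipe, this could be driven by something like sys.stdout.writelines(shuffle_stream(sys.stdin)). Whether holding everything in memory beats copying to a seekable file is, as noted, a matter for measurement.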
Andrei