random k-sample of a file
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Thu Oct 9 18:25:11 PDT 2008
Andrei Alexandrescu wrote:
> Carlos wrote:
>> : You can't do a uniform random distribution without knowing the length.
>>
>> Probably true for other distributions.
>> Most certainly not true for a uniform distribution (with a reasonable k).
>>
>> You can work on a subset of the file. Let's say 1000 records.
>> The distribution being uniform, you can select (or eliminate)
>> a percentage of each subset and the results for the whole file
>> will be OK.
>
> I think you can do even nonuniform distributions. The number of samples
> seen should not influence your subsampling decision.
I think "the number of total samples..." is the correct statement.
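
For illustration only (this is not spelled out in the thread): one well-known way to draw a uniform k-sample without knowing the length in advance is reservoir sampling, often called Algorithm R. The sketch below is in Python, and the name reservoir_sample and its parameters are hypothetical. Each decision depends on how many records have been seen so far, but never on the total number of records, which matches the corrected statement above.

```python
import random


def reservoir_sample(items, k, rng=None):
    """Return up to k items drawn uniformly at random from an iterable.

    The iterable is read once and its length is never needed.
    """
    if rng is None:
        rng = random.Random()
    reservoir = []
    for seen, item in enumerate(items, start=1):
        if seen <= k:
            # The first k records fill the reservoir unconditionally.
            reservoir.append(item)
        else:
            # Keep the new record with probability k / seen by picking a
            # slot in [0, seen) and replacing only if it falls in the reservoir.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = item
    return reservoir
```

Since a Python file object iterates over its lines, passing an open file as items would sample k lines in a single pass. If the input holds fewer than k records, all of them are returned.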
Andrei