[OT] The horizon of a stream
Nigel Sandever
nigelsandever at btconnect.com
Sun Oct 26 05:59:38 PDT 2008
On Sun, 26 Oct 2008 03:39:50 -0400, bearophile <bearophileHUGS at lycos.com> wrote:
> Nigel Sandever:
>
> >I did try that (using md5), but the penalty in Perl was horrible,<
>
> This is a D newsgroup, so use D, it allows you to manage bits more
> efficiently.
>
Sorry. No disrespect meant to D. I always prototype in Perl and then convert to
C or D if I need performance. I'm just more familiar with Perl.
>
> >I used (a slightly modified version of) 2of12inf available from<
>
> That's a quite complex file, so I suggest something simpler, as this after a
> cleaning of the non ASCII words:
> http://www.norvig.com/big.txt
I don't know what is "complex" about a 1 word per line, 81536 line dictionary
file?
Or how having everyone clean up Conan Doyle would be simpler?
If you have Perl, you can produce a suitable testfile from any 1 word per line
dictionary with the command line:
perl -l12n0777aF/\n/ -ne'print $F[rand @F] for 1..4e8' yourdict >thedata
With the 2of12inf dictionary file, 4e8 produces a file a little under 4GB in
around 10 minutes. YMWV depending upon the average length of the lines in your
local dict.
Of course the generated files won't be the same as mine, or anyone else's, but
given the random nature, the results will be broadly comparable.
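For anyone without Perl to hand, here is a rough Python sketch of what that one-liner does: slurp a one-word-per-line dictionary, then write N randomly chosen words, one per line. The function name make_test_file, the seed parameter and the latin-1 encoding are my own assumptions for illustration, not part of the original command, and this is not tuned for speed.

```python
import random


def make_test_file(dict_path, out_path, count, seed=None):
    """Write `count` randomly chosen dictionary words to out_path, one per line.

    Hypothetical helper: mirrors the Perl one-liner's behaviour, with an
    optional seed (an addition) so a run can be repeated exactly.
    """
    rng = random.Random(seed)

    # Slurp the whole dictionary and split it on newlines.
    # latin-1 is an assumption; it accepts any byte, so odd characters
    # in a local dictionary will not raise a decoding error.
    with open(dict_path, encoding="latin-1") as src:
        words = src.read().split("\n")

    # Drop trailing empty entries (the final newline leaves one behind).
    while words and words[-1] == "":
        words.pop()
    if not words:
        raise ValueError("dictionary file contains no words")

    # Emit `count` words picked uniformly at random, with replacement.
    with open(out_path, "w", encoding="latin-1", newline="\n") as out:
        for _ in range(count):
            out.write(rng.choice(words))
            out.write("\n")
```

Calling make_test_file("yourdict", "thedata", 400_000_000) would correspond to the 4e8 above; a much smaller count is sensible for a first try.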
My D-fu is too rusty to try to write that in D, especially for a D audience :)
I'm sure one of you guys can knock that up in the blink of an eye.
>
> Bye,
> bearophile