How to read fastly files ( I/O operation)
monarch_dodra
monarchdodra at gmail.com
Wed Feb 6 04:33:13 PST 2013
On Wednesday, 6 February 2013 at 11:15:22 UTC, monarch_dodra
wrote:
> I'm going to try and see with some example files if I can't get
> something running faster.
Benchmarking and tweaking, I was able to find 3 things that
speed up your program:
1) Make computeLocal a compile-time constant. This will give
you a teensy bit of performance. It depends on whether you plan to
make it a run-time argument switch, I guess.
2) Makes things about 10%-20% faster:
Your "nucleic" and "amino" hash tables map a character to an
index. However, given the range of the characters ('A' to 'Z'),
you are better off doing a flat array, where each index
represents a character, e.g. A is index 0, B is index 1. This way,
lookup is a simple array indexing, as opposed to a hash table
indexing.
You may even get a bigger bang for your buck by simply giving
your "_stats" structure a simple "A is index 0, B is index 1",
and only "re-order" the data at the end, when you want to read
it. (I haven't done this though).
3) Makes things about 100% faster (ran in half the time on my
machine): I don't know how mmFile works, but a simple File +
"rawRead" seems to get the job done fast. Also, instead of
keeping track of one (or several) indexes, I merely keep a single
slice. The only thing I care about is whether my slice is empty,
in which case I re-fill it.
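
As a rough illustration of point 2 (this is not the code from the paste; countLetters and tableSize are names made up for the example), a flat array indexed by "character minus 'A'" could look something like this:

```d
import std.stdio;

// Hypothetical stand-in for the "nucleic"/"amino" hash tables:
// slot 0 is 'A', slot 1 is 'B', ..., slot 25 is 'Z'.
enum size_t tableSize = 'Z' - 'A' + 1;

// Counts how often each upper-case letter occurs in the sequence.
size_t[tableSize] countLetters(const(char)[] sequence)
{
    size_t[tableSize] counts; // static arrays are zero-initialized
    foreach (char c; sequence)
    {
        // Skip anything outside 'A'..'Z' instead of indexing out of bounds.
        if (c < 'A' || c > 'Z')
            continue;
        ++counts[c - 'A']; // plain array indexing, no hashing
    }
    return counts;
}

void main()
{
    auto counts = countLetters("GATTACA");
    // "Re-order" only at the end, when the data is actually read.
    foreach (char c; "ACGT")
        writefln("%s: %s", c, counts[c - 'A']);
}
```

Whether this fits depends on what "_stats" really needs to hold; the sketch assumes only upper-case ASCII letters matter.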
The modified code is here. I'm apparently getting the same output
you are, but that doesn't mean there are no bugs in it. For
example, I noticed that you don't strip leading whitespace, if any,
before the first read.
http://dpaste.dzfl.pl/9b9353b8
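
For readers who can't reach the paste, here is a minimal sketch of the "File + rawRead + single slice" idea from point 3. It is not the pasted code: countNewlines, the 64 KiB chunk size, and the newline counting are assumptions picked only to keep the example small.

```d
import std.stdio;

// Counts newline bytes in a file, reading it in fixed-size chunks.
size_t countNewlines(string path)
{
    auto file = File(path, "rb");
    auto buffer = new ubyte[](64 * 1024); // chunk size is an arbitrary choice
    ubyte[] slice;                        // the not-yet-consumed part of buffer
    size_t count;

    while (true)
    {
        if (slice.length == 0)
        {
            // rawRead returns the part of buffer that was actually filled;
            // an empty result means end of file.
            slice = file.rawRead(buffer);
            if (slice.length == 0)
                break;
        }
        if (slice[0] == '\n')
            ++count;
        slice = slice[1 .. $]; // consume one byte
    }
    return count;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: ", args[0], " <file>");
        return;
    }
    writeln(countNewlines(args[1]));
}
```

The only state carried between iterations is the slice itself, so there is no separate index to keep in sync with the buffer.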
----
I'd be tempted to re-write the parser using a "byLine" approach,
since my quick reading about fastq seems to imply it is a
line-based format. Or just plain try to write a parser from scratch,
putting my own logic and thought into it (all I did was modify
your code, without caring about the actual algorithm).
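
A "byLine" version might start out roughly like the sketch below. It assumes the common four-line fastq record layout ("@id", sequence, "+", quality), which is an assumption on my part and not something checked against your files; countBases is a made-up name.

```d
import std.stdio;

// Counts upper-case letters on the sequence line of each record.
// Assumes every record is exactly four lines: "@id", sequence, "+", quality.
size_t[26] countBases(File input)
{
    size_t[26] counts; // slot 0 is 'A', slot 25 is 'Z'
    size_t lineNo;
    foreach (line; input.byLine())
    {
        // Only the second line of each four-line record holds the sequence.
        if (lineNo++ % 4 != 1)
            continue;
        foreach (char c; line)
            if (c >= 'A' && c <= 'Z')
                ++counts[c - 'A'];
    }
    return counts;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: ", args[0], " <file.fastq>");
        return;
    }
    auto counts = countBases(File(args[1]));
    foreach (char c; "ACGTN")
        writefln("%s: %s", c, counts[c - 'A']);
}
```

Note that byLine reuses its internal buffer between iterations, so a line has to be copied (e.g. with .dup) if it is kept past the loop body. Files with sequences wrapped over several lines would need a real parser instead of the modulo-4 trick.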