How to read files quickly (I/O operation)

Wed Feb 6 03:15:21 PST 2013

On Wednesday, 6 February 2013 at 10:43:02 UTC, bioinfornatics 
wrote:
> Instead of calling mmFile's opIndex to read ubyte by ubyte, I 
> tried reading into a buffer array of length PAGESIZE.
>
> code here: http://dpaste.dzfl.pl/25ee34fc
>
> It is not faster: parsing 12 GB takes 11 minutes. I do not 
> see how I could read the file any faster!
>
> As a reminder, fastxtoolkit needs 2 min!

This might be a stupid question, but I see a "writeln" in your 
inner loop. Is it just your console output that is slowing you 
down, by any chance?
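
To illustrate what I mean about the console, here is a minimal sketch that builds the output in memory and writes it once, instead of calling writeln for every item. The function name "render" and the int payload are placeholders I made up, not anything from your paste.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : write;

// Hypothetical helper: format every value into one in-memory buffer
// rather than touching the console once per item.
string render(const(int)[] values)
{
    auto buf = appender!string();
    foreach (v; values)
        buf.formattedWrite("%s\n", v);
    return buf.data;
}

void main()
{
    // A single console write for the whole batch.
    write(render([1, 2, 3]));
}
```

If the run time drops noticeably with this kind of change, the console was at least part of the problem.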

If I were you, I'd start benchmarking to find out what is 
slowing you down.

I'd reorganize the code to parse a file that is, say, 512 MB. The 
rationale being that you can load it into memory all at once. 
Then, I'd shift the logic from "fully process each character 
before moving to the next character" to "make a full processing 
pass on the entire data structure, before moving to the next pass".

The steps I see that need to be measured are:

* Raw read of file
* Iterating on your file to extract it as a raw array of "Data" 
objects
* Processing the Data objects
* Outputting the data
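
The steps above could be timed separately along these lines. This is only a rough sketch: the "Data" struct, the line-based "extract" pass, the placeholder "process" pass and the sample file name are all assumptions of mine, not your actual format or code. It uses StopWatch from std.datetime.stopwatch, which is where current Phobos keeps it.

```d
import std.datetime.stopwatch : StopWatch, AutoStart;
import std.file : read, write, remove, tempDir;
import std.path : buildPath;
import std.stdio : writefln;

// Hypothetical record: offset and length of one line in the raw buffer.
struct Data { size_t start; size_t length; }

// Pass 1: walk the whole buffer once and record where each line sits.
Data[] extract(const(ubyte)[] raw)
{
    Data[] records;
    size_t begin = 0;
    foreach (i, b; raw)
    {
        if (b == '\n')
        {
            records ~= Data(begin, i - begin);
            begin = i + 1;
        }
    }
    if (begin < raw.length) // last line without trailing newline
        records ~= Data(begin, raw.length - begin);
    return records;
}

// Pass 2: placeholder processing, here just the total payload size.
size_t process(const(Data)[] records)
{
    size_t total = 0;
    foreach (r; records)
        total += r.length;
    return total;
}

void main()
{
    // Small throwaway input so the sketch runs on its own.
    auto path = buildPath(tempDir(), "bench_sample.txt");
    write(path, "ACGT\nTTGA\nCC\n");
    scope(exit) remove(path);

    auto sw = StopWatch(AutoStart.yes);
    auto raw = cast(const(ubyte)[]) read(path); // raw read of file
    auto tRead = sw.peek();

    sw.reset();
    auto records = extract(raw);                // raw array of Data
    auto tExtract = sw.peek();

    sw.reset();
    auto total = process(records);              // processing pass
    auto tProcess = sw.peek();

    writefln("read: %s, extract: %s, process: %s (%s records, %s bytes)",
             tRead, tExtract, tProcess, records.length, total);
}
```

The output step can be wrapped in the same reset/peek pair. With a real 512 MB input, whichever number dominates tells you where to look first.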

Also (of course), you need to make sure you are compiling in 
release mode, e.g. with -O -release -inline for dmd (might sound 
obvious, but you never know). Are you using dmd? I heard the 
"other" compilers generate faster code.

I'm going to try and see with some example files if I can't get 
something running faster.