parsing fastq files with D

Marc Schütz via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Mar 24 06:38:32 PDT 2016


On Thursday, 24 March 2016 at 08:24:15 UTC, eastanon wrote:
> On Thursday, 24 March 2016 at 06:34:51 UTC, rikki cattermole 
> wrote:
>> As a little fun thing to do I implemented it for you.
>>
>> It won't allocate. Making this perfect for you.
>> With a bit of work you could make Result have buffers for 
>> the result instead of using the input array, which would allow 
>> the source to be an input range itself.
>>
>> I made this up on dpaste and single quotes were not playing 
>> nicely there. So you'll see "\r"[0] as a workaround.
>
> Thank you very much. I think you have exposed me to a number 
> of new concepts that I will go through and annotate the code 
> with.  I read all input from file as follows.
>
> string text = cast(string)std.file.read(inputfile);
> foreach(record;FastQRecord.parse(text)){
>    writeln(record);
> }
>
> <naivequestion>Does this mean that text is allocated in 
> memory? And is there a better way to read and process the 
> inputfile?</naivequestion>

Yes, it's read into your process's memory. You can use std.mmfile 
[1] to make things a bit more efficient. It also brings the data 
into memory, but it does so through memory mapping: only the 
parts that are actually accessed get loaded (everything, in your 
case), and the operating system can efficiently release and 
reload parts of it if memory runs low.

Unfortunately there is no example in the documentation, but it 
works like this (untested):

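import std.mmfile;
// Opens inputfile read-only and maps it into the address space.
auto file = new MmFile(inputfile);
// View the mapped bytes as text; the slice is only valid while `file` is alive.
string text = cast(string) file[];
...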
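
For something that compiles on its own, here is a slightly longer sketch of the same idea (also only lightly checked). The helper name countRecords, the temp file name and the sample data are made up for illustration, and it assumes every FASTQ record is exactly four lines. It casts to const(char)[] rather than string, since the mapping is not really immutable data.

```d
import std.file : remove, tempDir, write;
import std.mmfile : MmFile;
import std.path : buildPath;
import std.stdio : writeln;
import std.string : lineSplitter;

/// Hypothetical helper: counts FASTQ records in a memory-mapped file,
/// assuming exactly four lines per record.
size_t countRecords(string path)
{
    // Map the file read-only; pages are loaded as they are touched.
    auto file = new MmFile(path);
    // Unmap as soon as we are done; `text` must not be used after this.
    scope (exit) destroy(file);

    // View the mapped bytes as text without copying them.
    auto text = cast(const(char)[]) file[];

    size_t lines;
    foreach (line; text.lineSplitter)
        ++lines;
    return lines / 4;
}

void main()
{
    // Made-up sample file so the sketch is self-contained.
    auto path = buildPath(tempDir, "mmfile_sketch.fastq");
    write(path, "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nHHHH\n");
    scope (exit) remove(path);

    writeln(countRecords(path), " records");
}
```

The important point is lifetime: any slice taken from the MmFile points into the mapping, so either keep the MmFile around for as long as the slices are in use, or copy what you need (e.g. with .idup) before it goes away.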

[1] http://dlang.org/phobos/std_mmfile.html

