Parsing and splitting textfile
Hugo Florentino
hugo at acdam.cu
Mon Feb 24 12:19:06 PST 2014
On Mon, 24 Feb 2014 19:08:16 +0000 (UTC), Justin Whear wrote:
>
> Specifically std.regex.splitter[1] creates a lazy range over the
> input.
> You can couple this with lazy file reading (e.g.
> `File("mailbox").byChunk(1024).joiner`).
>
Would something like this work? (I cannot test it right now)
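import std.algorithm : joiner;
import std.regex : regex, splitter;
import std.stdio : File;

void main(string[] args)
{
    auto themailbox = args[1];
    immutable size_t chunksize = 1024 * 64;
    // "m" lets $ match at the end of each line, not only at the end of input.
    auto re = regex(`\n\nFrom .+ at .+$`, "m");
    // Lazy range over the file contents, read one chunk at a time.
    auto mailbox = File(themailbox).byChunk(chunksize).joiner;
    // Untested: splitter may expect a string (e.g. from std.file.readText)
    // rather than a lazy byte range.
    auto mail = splitter(mailbox, re);
}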
If so, I have a couple of further questions:
Using splitter removes the matched expression from the string. How
could I reinsert it at the beginning of each resulting string in an
efficient way (i.e. without copying data that is already loaded in
memory)?
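
One possible way to keep the separator without copying, as an untested sketch: instead of splitting, locate each "From " line with matchAll and slice the original string at those offsets. Slices in D refer to the original buffer, so only the array of slice descriptors is allocated. The function name splitKeepingSeparator is hypothetical, and the pattern is an adaptation of the one above (anchored with ^ instead of the leading blank line).

```d
import std.regex : matchAll, regex;

/// Returns slices of `text`, each one starting at a "From ... at ..." line.
/// Hypothetical helper for illustration only.
string[] splitKeepingSeparator(string text)
{
    // "m" makes ^ and $ match at the start and end of every line.
    auto re = regex(`^From .+ at .+$`, "m");
    string[] mails;
    size_t start = 0;
    foreach (m; matchAll(text, re))
    {
        // m.pre is the input before the match, so its length is the match offset.
        immutable pos = m.pre.length;
        if (pos > start)
            mails ~= text[start .. pos]; // a slice, not a copy
        start = pos;
    }
    if (start < text.length)
        mails ~= text[start .. $];
    return mails;
}
```

Each element of the result begins with its own "From " line, since the separator is never removed in the first place.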
I see that the splitter function returns a struct. How could I
progressively dump each resulting string to disk and drop it from the
struct, so that the full mailbox does not end up loaded into memory
(in this case as a struct)?
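
A minimal sketch of the progressive dump, under some assumptions: the struct returned by splitter is a lazy range, so iterating it with foreach yields one piece at a time and nothing accumulates unless the caller stores it. The input here is an ordinary string that is already in memory, so this only avoids collecting the pieces, not loading the file. The name dumpMails, the file naming scheme, and the lookahead pattern (which consumes only the blank line and leaves the "From " line in place) are illustrative choices, not something confirmed in this thread.

```d
import std.format : format;
import std.regex : regex, splitter;
import std.stdio : File;

/// Writes every piece of `text` to its own file as soon as it is produced.
/// Returns the number of files written. Hypothetical helper for illustration.
size_t dumpMails(string text, string prefix)
{
    // The lookahead (?=From ) is zero-width, so "From " stays in the next piece.
    auto re = regex(`\n\n(?=From )`);
    size_t count = 0;
    foreach (piece; splitter(text, re))
    {
        // "wb" avoids newline translation; the file is closed at scope exit.
        auto f = File(format("%s%04d.txt", prefix, count), "wb");
        f.write(piece);
        ++count;
    }
    return count;
}
```

Calling dumpMails(text, "mail_") would produce mail_0000.txt, mail_0001.txt, and so on.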
Regards, Hugo