Processing a gzipped CSV file line by line
Jon Degenhardt via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Wed May 10 23:13:01 PDT 2017
On Wednesday, 10 May 2017 at 22:20:52 UTC, Nordlöw wrote:
> What's the fastest way to decompress on the fly and process a
> gzipped CSV file line by line?
>
> Is it possible to combine
>
> http://dlang.org/phobos/std_zlib.html
>
> with some stream variant of
>
> File(path).byLineFast
>
> ?
I was curious what byLineFast was. I'm guessing it's from here:
https://github.com/biod/BioD/blob/master/bio/core/utils/bylinefast.d.
I haven't tested it, but it appears to predate the speed
improvements made to std.stdio.byLine about a year and a half
ago. If so, it would be worth comparing it against the current
Phobos version and, of course, against iopipe.
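For the std.zlib part of the question, here is a minimal sketch of
streaming decompression with line splitting, using only Phobos. The
names eachGzipLine and printGzipLines are made up for illustration,
the 64 KiB chunk size is an arbitrary assumption, and the code relies
on UnCompress.empty, which needs a reasonably recent Phobos. It is
untuned and says nothing about how it compares with byLine or iopipe.

```d
import std.stdio : File, writeln;
import std.string : indexOf;
import std.zlib : HeaderFormat, UnCompress;

/// Hypothetical helper: inflates gzip data supplied as a range of
/// ubyte[] chunks and passes each line (without its '\n') to `sink`.
void eachGzipLine(Chunks)(Chunks chunks,
                          scope void delegate(const(char)[]) sink)
{
    auto inflater = new UnCompress(HeaderFormat.gzip);
    char[] pending; // decompressed text not yet terminated by '\n'

    void feed(const(void)[] data)
    {
        pending ~= cast(const(char)[]) data;
        size_t start = 0;
        for (;;)
        {
            immutable nl = pending[start .. $].indexOf('\n');
            if (nl < 0)
                break;
            sink(pending[start .. start + nl]);
            start += nl + 1;
        }
        // Keep only the unfinished tail for the next chunk.
        pending = pending[start .. $].dup;
    }

    foreach (chunk; chunks)
    {
        feed(inflater.uncompress(chunk));
        if (inflater.empty) // end of the gzip stream was reached
            break;
    }
    if (!inflater.empty)
        feed(inflater.flush());
    if (pending.length)
        sink(pending); // final line without a trailing newline
}

/// Hypothetical usage: print every line of a gzipped file.
void printGzipLines(string path)
{
    eachGzipLine(File(path, "rb").byChunk(64 * 1024),
                 (const(char)[] line) { writeln(line); });
}
```

This splits on '\n' only, so it does not strip '\r' and, like byLine,
it knows nothing about CSV quoting.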
As mentioned in one of the other replies, byLine and its variants
aren't appropriate for CSV with escapes, since a quoted field can
itself contain a delimiter or a newline. For that, a real CSV
parser is needed. Alternatively, run a converter first that turns
the CSV into another format that can be processed line by line.
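To illustrate the escape problem, here is a small sketch using
std.csv.csvReader from Phobos as the "real CSV parser". The helper
name parseCsv and the sample text are made up for illustration. The
second record has a field containing a comma, an escaped quote and a
newline, which a plain line splitter would cut in two.

```d
import std.array : array;
import std.csv : csvReader;

/// Hypothetical helper: parses CSV text into rows of string fields.
string[][] parseCsv(string text)
{
    string[][] rows;
    foreach (record; csvReader!string(text))
        rows ~= record.array; // consume each record before moving on
    return rows;
}

/// Made-up sample: three records, although the text has four lines.
enum sampleCsv =
    "id,comment\n1,\"hello, \"\"world\"\"\nsecond line\"\n2,plain";
```

Calling parseCsv(sampleCsv) should yield three rows, with the quoted
field kept as a single value.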
--Jon