Processing a gzipped csv-file line by line

Jon Degenhardt via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Wed May 10 23:13:01 PDT 2017


On Wednesday, 10 May 2017 at 22:20:52 UTC, Nordlöw wrote:
> What's the fastest way to on-the-fly-decompress and process a 
> gzipped csv-file line by line?
>
> Is it possible to combine
>
> http://dlang.org/phobos/std_zlib.html
>
> with some stream variant of
>
> File(path).byLineFast
>
> ?

I was curious what byLineFast was; I'm guessing it's from here: 
https://github.com/biod/BioD/blob/master/bio/core/utils/bylinefast.d.

I didn't test it, but it appears it may pre-date the speed 
improvements made to std.stdio.byLine perhaps a year and a half 
ago. If so, it might be worth comparing it to the current Phobos 
version, and of course iopipe.
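
For the Phobos side of such a comparison, here is a rough sketch of one way to decompress on the fly and split into lines, using only std.zlib.UnCompress and File.byChunk. The names LineAssembler and eachLineOfGzip, and the 64 KiB chunk size, are made up for illustration. I have not benchmarked this, and it assumes a single-member gzip file with '\n' line endings:

```d
import std.stdio : File, writeln;
import std.string : indexOf;
import std.zlib : HeaderFormat, UnCompress;

/// Hypothetical helper: collects decompressed text and hands out complete lines.
struct LineAssembler
{
    private char[] pending; // tail of the input whose '\n' has not arrived yet

    /// Appends `data`, then calls `sink` once per completed line (without the '\n').
    void put(const(char)[] data, scope void delegate(const(char)[]) sink)
    {
        pending ~= data;
        size_t start = 0;
        for (;;)
        {
            immutable nl = pending[start .. $].indexOf('\n');
            if (nl < 0)
                break;
            sink(pending[start .. start + nl]);
            start += nl + 1;
        }
        pending = pending[start .. $]; // keep only the unfinished line
    }

    /// Emits a final line that has no trailing '\n', if there is one.
    void finish(scope void delegate(const(char)[]) sink)
    {
        if (pending.length)
            sink(pending);
        pending = null;
    }
}

/// Hypothetical helper: streams `path` through zlib and reports each line to `sink`.
void eachLineOfGzip(string path, scope void delegate(const(char)[]) sink)
{
    auto inflater = new UnCompress(HeaderFormat.gzip);
    LineAssembler lines;
    // The chunk size is an arbitrary assumption, not a tuned value.
    foreach (chunk; File(path, "rb").byChunk(64 * 1024))
        lines.put(cast(const(char)[]) inflater.uncompress(chunk), sink);
    lines.put(cast(const(char)[]) inflater.flush(), sink);
    lines.finish(sink);
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: prog FILE.gz");
        return;
    }
    size_t count;
    eachLineOfGzip(args[1], (const(char)[] line) { ++count; });
    writeln(count, " lines");
}
```

Only the partial last line is carried over between chunks, so memory use stays bounded by the chunk size plus the longest line. Like byLine, this splits on every '\n', so it has the same limitation for CSV with escapes.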

As mentioned in one of the other replies, byLine and its variants 
aren't appropriate for CSV with escapes, because a quoted field 
can itself contain newlines and delimiters. For that, a real CSV 
parser is needed. As an alternative, first run a converter from 
CSV to another format that is safe to split line by line.
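
To illustrate the escape problem, here is a small sketch that compares naive line splitting with std.csv.csvReader on made-up sample data. The helper name parseCsv and the sample text are assumptions for illustration only:

```d
import std.array : array;
import std.csv : csvReader;
import std.stdio : writeln;
import std.string : lineSplitter;

/// Hypothetical helper: parses CSV text into rows of unescaped string fields.
string[][] parseCsv(string text)
{
    string[][] rows;
    foreach (record; csvReader!string(text))
        rows ~= record.array;
    return rows;
}

void main()
{
    // Made-up data: the quoted field in the second record contains
    // both a comma and a newline.
    string data = "id,comment\n1,\"hello,\nworld\"\n2,plain";

    // Splitting on newlines cuts the quoted field in two.
    writeln("physical lines: ", data.lineSplitter.array.length);

    // A CSV parser keeps the quoted field together as one value.
    auto rows = parseCsv(data);
    writeln("csv records:    ", rows.length);
    writeln("second record:  ", rows[1]);
}
```

The sample has four physical lines but only three CSV records, which is the mismatch a line-based reader cannot detect on its own.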

--Jon

