Using iopipe to stream a gzipped file
Andrew
aabrown24 at hotmail.com
Thu Jan 4 12:23:42 UTC 2018
On Thursday, 4 January 2018 at 12:15:27 UTC, Steven Schveighoffer
wrote:
> On 1/4/18 7:01 AM, Andrew wrote:
>
>> Ah thank you, that makes sense. These types of files are
>> compressed using the bgzip utility so that the file can be
>> indexed meaning specific rows extracted quickly (there's more
>> details of this here http://www.htslib.org/doc/tabix.html and
>> the code can be found here:
>> https://github.com/samtools/htslib/blob/develop/bgzf.c)
>
> Hm... that utility seems to say it will result in bgz file
> extension? So this must be an extraction from one of those
> files?
>
> In any case, I'll figure out how to deal with concatenated gzip
> file, and update iopipe. Next version will focus on a bunch of
> stuff relating to the 2 zip threads recently posted here.
>
> Thanks!
>
> -Steve
That would be really great for me, thank you! By default bgzip
produces a file with the standard .gz extension. Looking at the
code, it adds an extra field to the standard gzip header:
/* BGZF/GZIP header (specialized from RFC 1952; little endian):
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 31|139|  8|  4|              0|  0|255|      6| 66| 67|      2|BLK_LEN|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
BGZF extension:
               ^                               ^   ^   ^
               |                               |   |   |
           FLG.EXTRA                        XLEN   B   C
BGZF format is compatible with GZIP. It limits the size of each compressed
block to 2^16 bytes and adds an extra "BC" field in the gzip header which
records the size.
*/
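To make the layout above concrete, here is a minimal Python sketch that reads those 18 bytes and pulls out the block length. The function name bgzf_block_size is made up for illustration and is not part of htslib or iopipe. It also assumes, going by my reading of the BGZF section of the SAM specification, that the stored BLK_LEN value is the total block size minus one. The sample bytes are the well-known empty BGZF end-of-file block.

```python
import struct

def bgzf_block_size(buf):
    """Return the total size in bytes of the BGZF block that starts at buf[0].

    Hypothetical helper for illustration only.
    """
    if len(buf) < 12:
        raise ValueError("need at least the 12 fixed gzip header bytes")
    # Fixed gzip prefix (little endian): ID1, ID2, CM, FLG, MTIME, XFL, OS, XLEN
    id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack_from("<BBBBIBBH", buf, 0)
    if (id1, id2, cm) != (31, 139, 8):
        raise ValueError("not a gzip member using deflate")
    if not flg & 4:
        raise ValueError("FLG.EXTRA is not set, so this is plain gzip")
    if len(buf) < 12 + xlen:
        raise ValueError("buffer ends inside the extra field")
    # Walk the extra subfields (SI1, SI2, SLEN, data) looking for 'B','C' with SLEN == 2.
    pos, end = 12, 12 + xlen
    while pos + 4 <= end:
        si1, si2, slen = struct.unpack_from("<BBH", buf, pos)
        if (si1, si2, slen) == (66, 67, 2) and pos + 6 <= end:
            (blk_len,) = struct.unpack_from("<H", buf, pos + 4)
            return blk_len + 1  # assumption: stored value is total block size minus 1
        pos += 4 + slen
    raise ValueError("no BC subfield found")

# The empty BGZF end-of-file block: 18 header bytes, an empty deflate
# stream (03 00), then CRC32 and ISIZE, both zero.
BGZF_EOF = bytes([31, 139, 8, 4, 0, 0, 0, 0, 0, 255, 6, 0, 66, 67, 2, 0, 27, 0,
                  3, 0, 0, 0, 0, 0, 0, 0, 0, 0])
print(bgzf_block_size(BGZF_EOF) == len(BGZF_EOF))
```

Since each block records its own length, a reader can hop from block to block without inflating anything, which is what makes the index-based row extraction possible.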
Thanks again!
Andrew
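
For reference, the concatenated-gzip behaviour discussed above can be reproduced outside of D. The following is only a sketch in Python using the standard zlib module, not a description of how iopipe does or will handle it. The function name decompress_all_members is hypothetical. A single decompressor stops at the end of the first member and leaves the remainder in unused_data, so the loop starts a fresh decompressor for each member.

```python
import gzip
import zlib

def decompress_all_members(data):
    """Decompress every gzip member in data and return the joined output.

    Hypothetical helper for illustration only.
    """
    out = []
    while data:
        # wbits=31 (16 + 15) tells zlib to expect a gzip header and trailer.
        d = zlib.decompressobj(wbits=31)
        out.append(d.decompress(data))
        out.append(d.flush())
        if not d.eof:
            raise ValueError("truncated gzip member")
        # Whatever follows the finished member is the start of the next one.
        data = d.unused_data
    return b"".join(out)

# Two independently compressed members back to back, as bgzip produces.
two_members = gzip.compress(b"hello ") + gzip.compress(b"world")

# A single decompressor only yields the first member.
single = zlib.decompressobj(wbits=31)
first_only = single.decompress(two_members)

print(first_only)
print(decompress_all_members(two_members))
```

The first print shows only the first member's contents, which matches the symptom of a reader that stops after one block of a bgzip file.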