Using iopipe to stream a gzipped file

Andrew aabrown24 at hotmail.com
Thu Jan 4 12:23:42 UTC 2018


On Thursday, 4 January 2018 at 12:15:27 UTC, Steven Schveighoffer 
wrote:
> On 1/4/18 7:01 AM, Andrew wrote:
>
>> Ah thank you, that makes sense. These types of files are 
>> compressed using the bgzip utility so that the file can be 
>> indexed meaning specific rows extracted quickly (there's more 
>> details of this here http://www.htslib.org/doc/tabix.html and 
>> the code can be found here: 
>> https://github.com/samtools/htslib/blob/develop/bgzf.c)
>
> Hm... that utility seems to say it will result in bgz file 
> extension? So this must be an extraction from one of those 
> files?
>
> In any case, I'll figure out how to deal with concatenated gzip 
> file, and update iopipe. Next version will focus on a bunch of 
> stuff relating to the 2 zip threads recently posted here.
>
> Thanks!
>
> -Steve

That would be really great for me, thank you! By default bgzip 
produces a file with the standard .gz extension. Looking at the 
code, bgzip adds an extra field to the standard gzip header:

/* BGZF/GZIP header (specialized from RFC 1952; little endian):
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  | 31|139|  8|  4|              0|  0|255|      6| 66| 67|      2|BLK_LEN|
  +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   BGZF extension:
                 ^                              ^   ^   ^
                 |                              |   |   |
                FLG.EXTRA                     XLEN  B   C

   BGZF format is compatible with GZIP. It limits the size of each compressed
   block to 2^16 bytes and adds an extra "BC" field in the gzip header which
   records the size.
*/

Thanks again!

Andrew

