GZip File Reading

Jonathan M Davis jmdavisProg at gmx.com
Thu Mar 10 00:18:53 PST 2011


On Thursday 10 March 2011 00:15:34 Lars T. Kyllingstad wrote:
> On Wed, 09 Mar 2011 21:34:29 -0800, Jonathan M Davis wrote:
> > On Wednesday 09 March 2011 21:10:59 Daniel Gibson wrote:
> >> On 10.03.2011 05:53, dsimcha wrote:
> >> > I noticed last night that Phobos actually has all the machinery
> >> > required for reading gzipped files, buried in etc.c.zlib. I've wanted
> >> > a high-level D interface for reading and writing compressed files
> >> > with an API similar to "normal" file I/O for a while. I'm thinking
> >> > about what the easiest/best design would be. At a high level there
> >> > are two designs:
> >> > 
> >> > 1. Hack std.stdio.File to support gzipped formats. This would allow
> >> > an identical interface for "normal" and compressed I/O. It would also
> >> > allow reuse of things like ByLine. However, it would require major
> >> > refactoring of File to decouple it from the C file I/O routines so
> >> > that it could call either the C or GZip ones depending on how it's
> >> > configured. Probably, it would make sense to make an interface that
> >> > wraps I/O functions and make an instance for C and one for gzip, with
> >> > bzip2 and other goodies possibly being added later.
> >> > 
> >> > 2. Write something completely separate. This would keep
> >> > std.stdio.File doing one thing well (wrapping C file I/O) but would
> >> > be more of a PITA for the user and possibly result in code
> >> > duplication.
> >> > 
> >> > I'd like to get some comments on what an appropriate API design and
> >> > implementation for writing gzipped files would be. Two key
> >> > requirements are that it must be as easy to use as std.stdio.File and
> >> > it must be easy to extend to support other single-file compression
> >> > formats like bz2.
> >> 
> >> Maybe a proper stream API would help. It could provide ByLine etc,
> >> could be used for any kind of compression format (as long as an
> >> appropriate input-stream is provided), ...
> >> (analogous for writing)
> > 
> > That was my thought. We really need proper streams...
> > 
> > The other potential issue with compressed files is that they can contain
> > directories and such.
> 
> Not gzip and bzip2 compressed files.  They only contain a single file.

Ah. True. I'm too used to always using tar with them. ;)

Actually, the fact that they only ever hold a single file makes them _way_ more
pleasant to deal with programmatically than zip...
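
To illustrate: since a .gz or .bz2 file is just one compressed stream, it can
sit behind the same file-like interface as a plain file. Below is a rough
sketch of that idea, written in Python rather than D only because Python's
standard gzip and bz2 modules already expose such an interface. The names
open_text, OPENERS and roundtrip are made up for illustration; none of this
is Phobos API or anything proposed in this thread.

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical table mapping a file suffix to an opener (illustration only).
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def open_text(path, mode="r"):
    """Open a plain, gzip or bzip2 file as text, choosing the backend by suffix."""
    opener = OPENERS.get(os.path.splitext(path)[1], open)
    # gzip.open and bz2.open default to binary, so ask for text mode explicitly.
    return opener(path, mode + "t", encoding="utf-8")


def roundtrip(suffix, lines):
    """Write lines to a temporary file with the given suffix and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt" + suffix)
        with open_text(path, "w") as f:
            f.writelines(line + "\n" for line in lines)
        # Iterating line by line works the same way for every backend.
        with open_text(path) as f:
            return [line.rstrip("\n") for line in f]
```

The caller never sees which backend is in use, which is roughly what a
File-like or stream-based design would give you. A zip archive would not fit
this shape, because you would first have to pick a member out of it.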

- Jonathan M Davis

