parallel unzip in progress

Jay Norwood jayn at prismnet.com
Mon Apr 2 22:27:07 PDT 2012


I'm working on a parallel unzip.  I started with Phobos std.zip, 
but found it to be too monolithic.  I needed to separate out 
the tasks that read the directory entries, create the directory 
tree, get the compressed data, expand the data, and create the 
uncompressed files on disk.  It currently unzips a 2 GB directory 
structure in about 18 secs, while 7zip takes around 55 secs.  Only 
about 4 secs of that is creating the directory structure and 
expanding the data; the other 14 secs is writing the regular 
files to disk.
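To make the split concrete, here is a minimal sketch of the staged approach using std.zip for parsing and std.parallelism for the expansion stage. It is not my actual implementation: unlike my version it still reads the whole archive into memory (std.zip's ZipArchive requires that), and `parallelUnzip` and its stage layout are illustrative names, not library API.

```d
import std.zip : ZipArchive, ArchiveMember;
import std.parallelism : parallel;
import std.file : read, write, mkdirRecurse;
import std.path : buildPath, dirName;

void parallelUnzip(string zipPath, string destDir)
{
    // Stage 1: read the archive and parse the central directory.
    auto archive = new ZipArchive(read(zipPath));

    // Stage 2: create the whole directory tree up front, single-threaded,
    // so the parallel workers never race on mkdir.
    foreach (name, member; archive.directory)
        mkdirRecurse(buildPath(destDir, dirName(name)));

    // Stage 3: expand and write each regular file; std.parallelism
    // distributes the iterations across the task pool.  Each worker
    // touches a distinct member, and expand() only reads the shared
    // compressed buffer.
    foreach (member; parallel(archive.directory.values))
    {
        if (member.name.length && member.name[$ - 1] != '/')
            write(buildPath(destDir, member.name), archive.expand(member));
    }
}
```

In the real version the expand and file-write stages are decoupled further, since the disk writes dominate the total time.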

The subtasks needed to be separated not only because of the need 
to run them in parallel, but also because the current std.zip 
implementation is a memory hog, keeping the whole compressed and 
expanded data sections in memory.  I was running out of memory in 
a 32-bit application just attempting to unzip the test file with 
the std.zip operations.  The parallel version peaks at around 
150 MB of memory used during the operation.


The parallel version is still missing the restoration of 
the original file attributes, and I see no example in the 
documentation of what would normally be done.  Am I missing this 
somewhere?  I'll have to dig around...
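For what it's worth, std.file does have setAttributes and setTimes, and ArchiveMember exposes the attributes and DOS timestamp from the central directory, so something like the following might be the expected pattern (a sketch only, assuming a reasonably recent Phobos; I haven't confirmed this is the intended usage, and the attribute bits are only meaningful when the archive was created on the same OS family):

```d
import std.zip : ArchiveMember;
import std.file : setAttributes, setTimes;
import std.datetime : DosFileTimeToSysTime;

// After writing the expanded file to `path`, restore what the
// central-directory entry recorded about it.
void restoreAttributes(string path, ArchiveMember member)
{
    // fileAttributes is 0 when the archiver recorded none.
    if (member.fileAttributes != 0)
        setAttributes(path, member.fileAttributes);

    // The DOS timestamp is always present; use it for both the
    // access and modification times, since zip stores only one.
    auto t = DosFileTimeToSysTime(member.time);
    setTimes(path, t, t);
}
```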




More information about the Digitalmars-d-learn mailing list