Help optimizing UnCompress for gzipped files

Steven Schveighoffer schveiguy at yahoo.com
Fri Jan 5 22:04:52 UTC 2018


On 1/5/18 3:09 PM, Christian Köstlin wrote:
> On 05.01.18 15:39, Steven Schveighoffer wrote:
>> Yeah, I guess most of the bottlenecks are inside libz, or the memory
>> allocator. There isn't much optimization to be done in the main program
>> itself.
>>
>> D compiles just the same as C. So theoretically you should be able to
>> get the same performance with a ported version of your C code. It's
>> worth a shot.
> I added another version that tries to do the "same" as the C version
> using Mallocator, but I am still way off; perhaps it's creating too many
> ranges on the underlying array. But it's around the same speed as your
> great iopipe thing.

Hm... I really think there is some magic initial state of the allocator, 
and that's what allows the C version to go so fast.

One thing about the D version: because druntime also uses malloc 
(the GC is backed by malloc'd data after all), the initial state of the 
heap is quite different from what you start with in C. It may be 
impossible, or nearly so, to duplicate the performance. But the flip 
side (if this is indeed the case) is that you won't see the same 
performance in a real-world app anyway, even in C.

One thing to try: preallocate the ENTIRE buffer. This only works if 
you know how many bytes it will decompress to (not always possible), but 
it will take the allocator out of the equation completely. And it's 
probably going to be the most efficient method (you aren't leaving 
behind smaller unused blocks when you realloc). If for some reason we 
can't beat/tie the C version doing that, then something else is going on.
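
For gzip specifically, one way to learn the size up front is the ISIZE 
field: per RFC 1952, the last four bytes of a gzip member hold the 
uncompressed length modulo 2^32, stored little-endian. Below is a minimal 
C sketch under that assumption. The names gzip_isize and alloc_output are 
made up for illustration, and the value can only be trusted for 
single-member files smaller than 4 GiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: read the ISIZE field from the last four bytes of
 * a gzip member held entirely in memory. RFC 1952 stores it little-endian
 * as the uncompressed size modulo 2^32.
 * Returns 0 on success, -1 if the buffer cannot be a gzip member. */
int gzip_isize(const unsigned char *gz, size_t len, uint32_t *out)
{
    /* 10-byte header plus 8-byte trailer (CRC32 + ISIZE) is the minimum. */
    if (gz == NULL || out == NULL || len < 18)
        return -1;
    /* Check the two gzip magic bytes. */
    if (gz[0] != 0x1f || gz[1] != 0x8b)
        return -1;

    const unsigned char *p = gz + len - 4;
    *out = (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
    return 0;
}

/* Hypothetical helper: allocate the whole output buffer with one malloc,
 * sized from ISIZE, so nothing has to be realloc'd while decompressing.
 * Returns NULL if the input is not gzip or the allocation fails. */
unsigned char *alloc_output(const unsigned char *gz, size_t len,
                            size_t *out_len)
{
    uint32_t n;

    assert(out_len != NULL);
    if (gzip_isize(gz, len, &n) != 0)
        return NULL;
    *out_len = n;
    /* malloc(0) may legally return NULL, so ask for at least one byte. */
    return malloc(n ? n : 1);
}
```

The returned buffer could then be handed to zlib as the complete output 
area (next_out/avail_out on the z_stream), so the allocator is touched 
exactly once for the output.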

> My solution does have the same memory leak, as I am not sure how to best
> get the memory out of the FastAppender so that it is automagically
> cleaned up. Perhaps if we get rc things, this gets easier?

I've been giving some thought to this. I think iopipe needs some buffer 
management primitives that allow you to finagle the buffer. I've been 
needing this for some time anyway (for file seeking). Right now, the 
buffer itself is buried in the chain, so it's hard to get at the actual 
buffer.

Alternatively, I probably also need to give some thought to a mechanism 
that auto-frees the memory when it can tell nobody is still using the 
iopipe. Given that iopipe's signature feature is direct buffer access, 
this would mean anything that uses such a feature would have to be unsafe.

-Steve


More information about the Digitalmars-d-learn mailing list