Help optimizing UnCompress for gzipped files

Christian Köstlin christian.koestlin at gmail.com
Fri Jan 5 20:09:17 UTC 2018


On 05.01.18 15:39, Steven Schveighoffer wrote:
> Yeah, I guess most of the bottlenecks are inside libz, or the memory
> allocator. There isn't much optimization to be done in the main program
> itself.
>
> D compiles just the same as C. So theoretically you should be able to
> get the same performance with a ported version of your C code. It's
> worth a shot.
I added another version that tries to do the "same" as the C version
using Mallocator, but I am still way off from it; perhaps it's creating
too many ranges on the underlying array. Still, it's around the same
speed as your great iopipe thing.
My solution does have the same memory leak, as I am not sure how best to
get the memory out of the FastAppender so that it is automatically
cleaned up. Perhaps if we get rc things, this gets easier?
I updated https://github.com/gizmomogwai/benchmarks/tree/master/gunzip
with the newest numbers on my machine, but I think your iopipe solution
is the best one we can get at the moment!
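
For reference, the malloc/realloc pattern I am trying to match looks
roughly like the following. This is only a minimal sketch and not the
code from the repository: ByteBuf, bytebuf_append and bytebuf_release
are hypothetical names, and the 4096-byte starting capacity is an
arbitrary assumption. It grows one heap block geometrically with
realloc and then hands the block over to the caller, which is the
"getting the memory out" step I have not solved for the FastAppender.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer (illustration only, not from the repo). */
typedef struct {
    unsigned char *data; /* heap block owned by the buffer, or NULL */
    size_t len;          /* bytes in use */
    size_t cap;          /* bytes allocated */
} ByteBuf;

/* Append n bytes from src, growing the block geometrically with realloc.
 * Returns 0 on success, -1 on overflow or allocation failure. */
int bytebuf_append(ByteBuf *b, const void *src, size_t n)
{
    assert(b != NULL);
    if (n > (size_t)-1 - b->len)
        return -1; /* len + n would overflow size_t */
    size_t need = b->len + n;
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 4096; /* assumed starting capacity */
        while (cap < need) {
            if (cap > (size_t)-1 / 2) { cap = need; break; }
            cap *= 2;
        }
        /* realloc may extend the block in place, which avoids a copy. */
        unsigned char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1; /* the old block is still valid and owned by b */
        b->data = p;
        b->cap = cap;
    }
    if (n > 0)
        memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Transfer ownership of the block to the caller, who must free() it.
 * The buffer is reset to empty so it cannot free or reuse the block. */
unsigned char *bytebuf_release(ByteBuf *b, size_t *len_out)
{
    unsigned char *p = b->data;
    if (len_out != NULL)
        *len_out = b->len;
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
    return p;
}
```

In a decompression loop each inflated chunk would be passed to
bytebuf_append, and whoever calls bytebuf_release at the end is
responsible for the single free(). Without that last call you get the
same kind of leak my D version has.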

>> rust is doing quite well there
> 
> I'll say a few words of caution here:
> 
> 1. Almost all of these tests use the same C library to unzip. So it's
> really not a test of the performance of decompression, but the
> performance of memory management. And it appears that any test using
> malloc/realloc is in a different tier. Presumably because of the lack of
> copies (as discussed earlier).
> 2. Your rust test (I think, I'm not sure) is testing 2 things in the
> same run, which could potentially have dramatic consequences for the
> second test. For instance, it could already have all the required memory
> blocks ready, and the allocation strategy suddenly gets better. Or maybe
> there is some kind of caching of the input being done. I think you have
> a fairer test for the second option by running it in a separate program.
> I've never used rust, so I don't know what exactly your code is doing.
> 3. It's hard to make a decision based on such microbenchmarks as to
> which solution is "better" in an actual real-world program, especially
> when the state/usage of the memory allocator plays a huge role in this.
Sure, that's true.



