Help optimizing UnCompress for gzipped files
Steven Schveighoffer
schveiguy at yahoo.com
Fri Jan 5 14:39:19 UTC 2018
On 1/5/18 1:01 AM, Christian Köstlin wrote:
> On 04.01.18 20:46, Steven Schveighoffer wrote:
>> On 1/4/18 1:57 PM, Christian Köstlin wrote:
>>> Thanks Steve,
>>> this runs now faster, I will update the table.
>>
>> Still a bit irked that I can't match the C speed :/
>>
>> But, I can't get your C speed to duplicate on my mac even with gcc, so
>> I'm not sure where to start. I find it interesting that you are not
>> using any optimization flags for gcc.
> I guess, the code in my program is small enough that the optimize flags
> do not matter... most of the stuff is pulled from libz? Which is
> dynamically linked against /usr/lib/libz.1.dylib.
Yeah, I guess most of the bottlenecks are inside libz, or the memory
allocator. There isn't much optimization to be done in the main program
itself.
> I also cannot understand what I should do more (will try realloc with
> Mallocator) for the dlang-low-level variant to get to the c speed.
D compiles just the same as C. So theoretically you should be able to
get the same performance with a ported version of your C code. It's
worth a shot.
> rust is doing quite well there
I'll say a few words of caution here:
1. Almost all of these tests use the same C library to unzip. So it's
really not a test of the performance of decompression, but the
performance of memory management. And it appears that any test using
malloc/realloc is in a different tier, presumably because of the lack of
copies (as discussed earlier).
2. Your Rust test (I think, I'm not sure) is testing 2 things in the
same run, which could potentially have dramatic consequences for the
second test. For instance, it could already have all the required memory
blocks ready, and the allocation strategy suddenly gets better. Or maybe
there is some kind of caching of the input being done. I think you would
get a fairer test of the second option by running it in a separate program.
I've never used Rust, so I don't know what exactly your code is doing.
3. It's hard to make a decision based on such microbenchmarks as to
which solution is "better" in an actual real-world program, especially
when the state/usage of the memory allocator plays a huge role in this.
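To illustrate point 1, here is a minimal sketch in C of the kind of realloc-based output buffer I have in mind. It is not taken from any of the benchmark programs: the names OutBuf, outbuf_reserve, outbuf_commit and outbuf_free are hypothetical, and the 4096-byte starting capacity is an arbitrary choice. The idea is that the decompressor writes straight into the free tail of the buffer, and realloc can often extend the block in place rather than copying everything to a new one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer (illustrative names only). */
typedef struct {
    unsigned char *data; /* start of the allocation, NULL when empty */
    size_t len;          /* bytes already written */
    size_t cap;          /* bytes allocated */
} OutBuf;

/* Make sure at least n free bytes follow the written data and return a
 * pointer to them, or NULL if realloc fails (the buffer is then unchanged).
 * Capacity doubles, so the number of realloc calls grows only
 * logarithmically with the final size. Overflow checks are omitted. */
static unsigned char *outbuf_reserve(OutBuf *b, size_t n)
{
    assert(b != NULL);
    if (n > b->cap - b->len) {
        size_t cap = b->cap ? b->cap : 4096; /* assumed starting size */
        while (cap - b->len < n)
            cap *= 2;
        unsigned char *p = realloc(b->data, cap);
        if (p == NULL)
            return NULL;
        b->data = p;
        b->cap = cap;
    }
    return b->data + b->len;
}

/* Record that n bytes were written into the reserved region. */
static void outbuf_commit(OutBuf *b, size_t n)
{
    assert(b != NULL && n <= b->cap - b->len);
    b->len += n;
}

/* Release the allocation and reset the buffer to empty. */
static void outbuf_free(OutBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

In a decompression loop, one would presumably point the library's output pointer at the result of outbuf_reserve, let it fill some bytes, then call outbuf_commit with the count actually produced. Whether realloc really avoids the copy depends on the allocator and on what else is on the heap, which is exactly why these numbers are hard to generalize.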
-Steve