fun project - improving calcHash

Anders Halager halager+dlang at gmail.com
Mon Jun 24 13:08:11 PDT 2013


On Monday, 24 June 2013 at 11:11:42 UTC, qznc wrote:
> On Sunday, 23 June 2013 at 21:22:40 UTC, Walter Bright wrote:
>> https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c#L21
>>
>> Profiling shows the calcHash function is a significant 
>> contributor to compilation time (3.25% of total time). So 
>> making it faster is a win. Even making dmd 1% faster would be 
>> a nice win - all those little drops add up.
>>
>> There are many, many string hash functions findable through 
>> google. Anyone want to spend the effort to make a faster one? 
>> Remember, ya gotta prove it's faster!
>>
>> A nice timing test would be the time expending compiling 
>> Phobos. I would suggest that the 64 bit build of dmd be used 
>> for timing tests.
>>
>> Also, be careful, many of those hash functions on the 
>> intarnets have a license that makes it unusable for dmd.
>
> My first idea was to look at Python, since it requires a lot of 
> hash calculation dynamically and probably is one of the most 
> tuned implementations. Interesting how naive it is:
>
> http://hg.python.org/cpython/file/7aab60b70f90/Objects/object.c#l759
>
> Just a simple for loop over chars. No switch.
>
> Wikipedia: "Python Software Foundation License (PSFL) is a 
> BSD-style permissive free software license"

Python is one of the slower interpreted languages. It would be 
more interesting to look at luajit which actually does something 
clever.
If the string is at least 4 chars long it only hashes the first 4 
bytes, the last 4, the 4 starting at floor(len/2)-2 and the 4 
starting at floor(len/4)-1. Any of these may overlap of course 
but that isn't a problem.

The code is MIT licensed and can be found here:
http://repo.or.cz/w/luajit-2.0.git/blob/053041a9f47e3d341f98682ea1e4907a578e4920:/src/lj_str.c#l104

As others have mentioned it will be hard to improve the speed 
enough to get dmd 1% faster - but if anything can it's dirty 
tricks.


More information about the Digitalmars-d mailing list