Article: Increasing the D Compiler Speed by Over 75%

Dmitry Olshansky dmitry.olsh at gmail.com
Fri Jul 26 03:47:30 PDT 2013


26-Jul-2013 01:25, Walter Bright writes:
> On 7/25/2013 1:00 PM, bearophile wrote:
>> Walter Bright:
>>
>>> It's not the hashing that's slow. It's the lookup that is.
>>
>> By "different hashing scheme" I meant different strategies in
>> resolving hash
>> collisions, likes double hashing, internal hashing, cuckoo hashing,
>> and so on
>> and on. Maybe one of such alternative strategies is more fit for the
>> needs of
>> dmd compilation. (I think that currently the Python dicts are using a
>> hashing
>> strategy different from the built-in dictionaries of D. The Python
>> style of
>> hashing was implemented in D some months ago, but I don't remember
>> what happened
>> to that project later).
>
> Hash collisions are not the problem - I sized the hash bucket array to
> make it fairly sparse. Neither is the hash algorithm.
>
> The slowness was in the frackin' "convert the hash to an index in the
> bucket", which is a modulus operation.

Then it's past due to finally stop the madness of the modulo-prime table 
size and use a power-of-2 size. In essence, all that modulo by a prime 
does is compensate for a weak hash function: it helps distribute values 
more evenly across the buckets.

Any decent modern hash is good enough to work with the plain "slice off 
the lower bits" approach.

>
> Also, computing the hash is done exactly once, in the lexer.

All the more reason to use a good hash function and kill the modulo by a 
prime.

> Thereafter,
> all identifiers are known only by their handles, which are (not
> coincidentally) the pointer to the identifier, and by its very nature is
> unique.
>


-- 
Dmitry Olshansky
