Article: Increasing the D Compiler Speed by Over 75%

Thu Jul 25 14:25:20 PDT 2013

On 7/25/2013 1:00 PM, bearophile wrote:
> Walter Bright:
>
>> It's not the hashing that's slow. It's the lookup that is.
>
> By "different hashing scheme" I meant different strategies in resolving hash
> collisions, likes double hashing, internal hashing, cuckoo hashing, and so on
> and on. Maybe one of such alternative strategies is more fit for the needs of
> dmd compilation. (I think that currently the Python dicts are using a hashing
> strategy different from the built-in dictionaries of D. The Python style of
> hashing was implemented in D some months ago, but I don't remember what happened
> to that project later).

Hash collisions are not the problem - I sized the hash bucket array to make it 
fairly sparse. Neither is the hash algorithm.

The slowness was in the frackin' "convert the hash to an index in the bucket", 
which is a modulus operation.

Also, computing the hash is done exactly once, in the lexer. Thereafter, all 
identifiers are known only by their handles, which are (not coincidentally) the 
pointer to the identifier, and by its very nature is unique.