Article: Increasing the D Compiler Speed by Over 75%
Dmitry Olshansky
dmitry.olsh at gmail.com
Tue Jul 30 11:02:14 PDT 2013
26-Jul-2013 23:17, Walter Bright пишет:
> On 7/26/2013 5:11 AM, Dmitry Olshansky wrote:
>> 26-Jul-2013 14:47, Dmitry Olshansky пишет:
>>> 26-Jul-2013 01:25, Walter Bright пишет:
>>
>>>> The slowness was in the frackin' "convert the hash to an index in the
>>>> bucket", which is a modulus operation.
>>>
>>> Then it's past due to finally stop the madness of modulo prime table and
>>> use a power of 2 size. In essence what modulo prime does is simply
>>> enhancing the quality of your hash function w.r.t. collisions (it helps
>>> to distribute values more evenly).
>>>
>>> Any of new decent hashes are good enough to work with plain slice the
>>> lower bits approach.
>>>
>>
>> To be more concrete:
>> Spooky hash
>> http://burtleburtle.net/bob/hash/spooky.html (Public domain)
>> S-box hash
>> http://home.comcast.net/~bretm/hash/10.html (Published paper)
>>
>> Or even a good ol' FNV (Public domain)
>> http://isthe.com/chongo/tech/comp/fnv/#FNV-1a
>>
>
> How about a pull request so we can try it out?
Preliminary pull is here:
https://github.com/D-Programming-Language/dmd/pull/2436
So far it looses a bit. I'm still playing with it, looking at load
factor, distribution of sizes, profiling.
So far observations are that SpookyHash is slower then the one that was
there thus stealing a few percents of speed. That is then hardly
regained by a faster lookup of a slot:
almost all of "large" tables were 31 in size and you have special case
for that anyway.
What bothers me is that while I've been hacking at this I couldn't shake
off the feeling that AA code assumes NO FULL HASH COLLISIONS at all?
Isn't that betting on luck (and a crude hash) a little too much (esp in
32 bit mode)?
That is e.g. in code pasted from aav.c Key is only a hash and there is
no way whatsoever to discern a full hash collision.
Value _aaGetRvalue(AA* aa, Key key)
{
//printf("_aaGetRvalue(key = %p)\n", key);
if (aa)
{
size_t i;
size_t len = aa->b_length;
if (len == 4)
i = (size_t)key & 3;
else if (len == 31)
i = (size_t)key % 31;
else
i = (size_t)key % len;
aaA* e = aa->b[i];
while (e)
{
if (key == e->key)
return e->value;
e = e->next;
}
}
return NULL; // not found
}
--
Dmitry Olshansky
More information about the Digitalmars-d-announce
mailing list