fun project - improving calcHash

Mon Jun 24 15:56:33 PDT 2013

On 6/24/13 5:56 PM, Anders Halager wrote:
> On Monday, 24 June 2013 at 20:19:31 UTC, Walter Bright wrote:
>> On 6/24/2013 1:08 PM, Anders Halager wrote:
>>> Python is one of the slower interpreted languages. It would be more
>>> interesting to look at luajit which actually does something clever.
>>> If the string is at least 4 chars long it only hashes the first 4
>>> bytes, the last 4, the 4 starting at floor(len/2)-2 and the 4
>>> starting at floor(len/4)-1.
>>> Any of these may overlap of course but that isn't a problem.
>>
>> I used that method back in the 1980's, it was well known then, but
>> perhaps has drifted into obscurity. In fact, I still use it for
>> hashing identifiers in DMC++.
>
> I can't imagine all the clever (even if outdated) tricks that have
> disappeared with retired old-timers :)
>
> I haven't set up anything for testing but if someone wants to try I've
> made a quick patch here: http://dpaste.com/hold/1268958/

This is significantly faster than anything submitted thus far. Compiled 
alongside Juan Manuel Cabo's submission, the results are as follows:

Times hashing words:

	Unchanged : 1386 ms
	One switch: 1338 ms
	Only add : 1354 ms
	Anders Halager : 933 ms

Times hashing entire lines:

	Unchanged : 335 ms
	One switch: 332 ms
	Only add : 331 ms
	Anders Halager : 125 ms

I wonder how much faster it can get.
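
For anyone who wants to experiment without the patch, here is a rough
sketch in C of the sampling scheme Anders describes in the quote above:
for strings of at least 4 bytes, only four 4-byte windows are hashed
(the first 4, the last 4, the 4 at floor(len/2)-2 and the 4 at
floor(len/4)-1). The names sampled_hash, load32 and mix are made up for
illustration, and the mixing step is an assumed FNV-style multiply, not
LuaJIT's actual mixer and not necessarily what the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 4 bytes from p; memcpy avoids alignment and aliasing problems. */
static uint32_t load32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical mixing step (FNV-style); an assumption, not LuaJIT's mixer. */
static uint32_t mix(uint32_t h, uint32_t v)
{
    return (h ^ v) * 16777619u;
}

/* Hash at most 16 bytes of the input, regardless of its length. */
uint32_t sampled_hash(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t h = 2166136261u ^ (uint32_t)len;  /* fold the length into the seed */

    if (len < 4) {
        /* Too short for a 4-byte window: hash every byte. */
        for (size_t i = 0; i < len; i++)
            h = mix(h, p[i]);
        return h;
    }

    h = mix(h, load32(p));                   /* first 4 bytes */
    h = mix(h, load32(p + len - 4));         /* last 4 bytes */
    h = mix(h, load32(p + (len >> 1) - 2));  /* 4 bytes at floor(len/2)-2 */
    h = mix(h, load32(p + (len >> 2) - 1));  /* 4 bytes at floor(len/4)-1 */
    return h;
}
```

The cost per string is constant, which is presumably where the speedup
on whole lines comes from. The price is that two strings of equal length
that differ only in bytes outside the four windows get the same hash, so
how well this works depends on the keys.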

-- 

Andrew Edwards
--------------------
http://www.akeron.co
auto getAddress() {
     string location = "@", period = ".";
     return ("info" ~ location ~ "afidem" ~ period ~ "org");
}