Python dictionary inspired hash table implementation

Moritz Warning moritzwarning at web.de
Thu Jul 3 03:26:22 PDT 2008


On Thu, 03 Jul 2008 06:49:03 +0000, Manfred_Nowak wrote:

> bearophile wrote:
> 
>> creating benchmarks is a very tricky thing
> 
> Yes. And as one may recognize: Moritz designed the benchmarks in such a
> way,
> - that collisions at least aren't obvious - that the locality of two
> consecutive inserts/acesses seems to be close to maximum

There was no intention in designing the "benchmark" to get high speed for 
this particular implementation.
It's just the simplest approach to test for errors most people would come 
up to.
It's main aim is to detect bigger errors (and it already did).

> In addition: the space effciency of as low as 25% which Moritz' attempt
> allows may be inacceptable.

The build-in AA runs out of space on my machine (1GB)
when other implementations keep on working (GC was enabled, 20m uint 
pairs).
Apart from that, the Dictionary implementation is open for improvement.

> 
> Whithout some agreed scenario, i.e. specifications
> 
> - about space effciency
> - about static properties of the key spaces - about dynamic properties
> of the actually used set of keys and its insert/access sequences

That's a todo.
Don't expect me to begin testing with an iso certificated benchmark 
scenario from start up. ;-)


> every user of a dictionary will be able to claim, that his preferred
> implementation is better than any other.

As already was said and most people know, designing hash tables is a 
tricky thing.
You can mostly/always design them to be the fastest in the world for a 
particular testing method.

The main goal is to perform well on most everyday use cases.
Since the implementation is based on Pythons Dictionary
(which is aimed for a general use),
There is justified hope that it doesn't perform much worse
for other use cases as well.
But that's up to testing and and further improvements!


- mwarning



More information about the Digitalmars-d mailing list