About built-in AAs

Josh Simmons simmons.44 at gmail.com
Tue Aug 16 20:10:38 PDT 2011


On Wed, Aug 17, 2011 at 12:40 PM, Andrew Wiley <wiley.andrew.j at gmail.com> wrote:
>
>
> On Tue, Aug 16, 2011 at 7:29 PM, bearophile <bearophileHUGS at lycos.com>
> wrote:
>>
>> Walter Bright:
>>
>> > > I think there are search trees like the Red-Black ones that guarantee
>> > > an O(n log n) worst case. Am I wrong?
>> >
>> > Just feed it more data.
>>
>> If you feed it more data, even if all items produce collisions because
>> they all hash to the same bucket, using Red-Black trees to handle the
>> items in the same bucket keeps the behaviour at O(n log n), which is
>> usually fast enough. With Python and the new D AAs you instead get a
>> quadratic one. This quadratic behaviour causes trouble well before
>> physical RAM is exhausted.
>
> The problem here is that algorithmic complexity doesn't really tell the
> whole story. We can talk about what "should" be faster all day, but unless
> we can prove that there's really a problem here, this doesn't seem like it's
> worth worrying about.
> And if it was changed in the past because the linked lists turned out to be
> faster, I'd guess that actual benchmarking was probably used back then to
> determine which was better.
>
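To make the worst case described above concrete, here's a quick sketch
(my own illustration, not from the original exchange; it assumes a
reasonably current DMD/Phobos, and "BadKey" is a made-up type). A key
whose toHash always returns the same value forces every entry to
collide, so each insert into the built-in AA degenerates to a linear
scan and n inserts cost quadratic time in total, whereas a per-bucket
balanced tree would keep the total at O(n log n):

import std.datetime.stopwatch : StopWatch;
import std.stdio : writefln;

struct BadKey
{
    int id;

    // Deliberately terrible hash: every key lands in the same bucket.
    size_t toHash() const @safe pure nothrow
    {
        return 0;
    }

    bool opEquals(const BadKey other) const @safe pure nothrow
    {
        return id == other.id;
    }
}

void main()
{
    enum n = 20_000;
    int[BadKey] aa;

    StopWatch sw;
    sw.start();
    foreach (i; 0 .. n)
        aa[BadKey(i)] = i;   // every insert re-scans the one overloaded bucket
    sw.stop();

    writefln("%s fully-colliding inserts took %s", n, sw.peek);
}

Halving or doubling n should change the runtime by roughly 4x, which is
the quadratic growth bearophile is pointing at.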

There's a lot more to making a hash map robust than changing the
lookup scheme. There's also a wide variety of collision resolution
methods, each with pros and cons depending on the data set and
lookup patterns you're dealing with. For a built-in AA I feel
simplicity is a much more achievable goal than trying to solve
everybody's specific problems, so the current solution is probably
a good choice.
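And anyone whose workload really does hit a pathological case can
always reach for a library container instead of the built-in AA. As a
small sketch (again my own illustration, assuming a reasonably current
Phobos), std.container.rbtree gives guaranteed O(log n) operations no
matter how the keys would have hashed -- it's a set rather than a map,
but it shows the escape hatch exists:

import std.container.rbtree : redBlackTree;
import std.stdio : writeln;

void main()
{
    // Balanced tree: insert and membership tests are O(log n) worst case.
    auto tree = redBlackTree(5, 1, 3);
    tree.insert(4);

    writeln(3 in tree);   // true
    writeln(tree[]);      // elements in sorted order: [1, 3, 4, 5]
}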

Just my two cents anyway.
Josh.

