AA performance again

bearophile bearophileHUGS at lycos.com
Mon Jun 9 13:35:40 PDT 2008


Steven Schveighoffer:
>To give you an idea of the speedup, the builtin AA's take 40.5 seconds to do 10m inserts on my system.<

I see, quite different from about 15 s.


>print out the time it takes for each million inserts :)

Okay, for 10 million inserts:

0.44 s
0.75 s
1.20 s
1.34 s
2.09 s
2.62 s
2.33 s
1.92 s
4.67 s
5.48 s
Total time: 24.0 s

The code I have used:

import std.conv: toUint;
import d.time:clock;//www.fantascienza.net/leonardo/so/libs_d.zip
void main() {
    size_t[long] aa;
    for (uint j; j < 10; ++j) {
        float t = clock();
        uint min = 1_000_000 * j;
        uint max = 1_000_000 * (j + 1);
        for (uint i = min; i < max; ++i)
            aa[i] = 0;
        printf("%f\n", clock() - t);
    }
}


>You might also want to try setting the value of each element differently. I'm not sure it will make a difference, but perhaps python is doing something clever if all the values are 0.<

Okay, but I think the results will not change much.

The D timings show very little changes:
 1_000_000   0.48 s
 1_000_000   1.26 s
 5_000_000   5.97 s
10_000_000  24.14 s

The D code I have used:

import std.conv: toUint;
void main(string[] args) {
    uint n = toUint(args[1]);
    size_t[long] aa;
    for (uint i; i < n; ++i)
        aa[i] = i;
}


The Python timings are exactly the same:
 1_000_000   0.25 s
 1_000_000   0.41 s
 5_000_000   0.80 s
10_000_000   1.48 s
20_000_000   2.88 s

The Python code I have used:

import sys
def main(n):
    aa = {}
    for i in xrange(n):
        aa[i] = i
import psyco; psyco.bind(main)
main(int(sys.argv[1]))

Without Psyco timings are similar but a little higher. Disabling the garbage collector timings become a just a bit lower.


>The times you quote seem awfully fast, but the complexity of the issues is really beyond my understanding.<

Generally Python uses one or more dict access every time you use a variable, you read or write it, so they must be fast. In Python the sort (Timsort) and dicts/sets are optimized to death, they are now rather old, and not even Raymond Hettinger seem able to improve them :-)

You may want to take a look here again, there are some more questions and answers:
http://leonardo-m.livejournal.com/64284.html

Bye,
bearophile



More information about the Digitalmars-d mailing list