Need for speed

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Apr 1 19:38:39 UTC 2021


On Thu, Apr 01, 2021 at 04:52:17PM +0000, Nestor via Digitalmars-d-learn wrote:
[...]
> ```
> import std.stdio;
> import std.random;
> import std.datetime.stopwatch : benchmark, StopWatch, AutoStart;
> import std.algorithm;
> 
> void main()
> {
>     auto sw = StopWatch(AutoStart.no);
>     sw.start();
>     int[] mylist;

Since the length of the array is already known beforehand, you could get
significant speedups by preallocating the array:

	int[] mylist = new int[100000];
	for (int number ...)
	{
		...
		mylist[number] = n;
	}


>     for (int number = 0; number < 100000; ++number)
>     {
>         auto rnd = Random(unpredictableSeed);
[...]

Don't reseed the RNG every loop iteration. (1) It's very inefficient and
slow, and (2) it actually makes it *less* random than if you seeded it
only once at the start of the program.  Move this outside the loop, and
you should see some gains.


>         auto n = uniform(0, 100, rnd);
>         mylist ~= n;
>     }
>     mylist.sort();
>     sw.stop();
>     long msecs = sw.peek.total!"msecs";
>     writefln("%s", msecs);
> }
[...]
> ```

Also, whenever performance matters, use gdc or ldc2 instead of dmd. Try
`ldc2 -O2`, for example.


I did a quick test with LDC, with a side-by-side comparison of your
original version and my improved version:

-------------
import std.stdio;
import std.random;
import std.datetime.stopwatch : benchmark, StopWatch, AutoStart;
import std.algorithm;

void original()
{
    auto sw = StopWatch(AutoStart.no);
    sw.start();
    int[] mylist;
    for (int number = 0; number < 100000; ++number)
    {
        auto rnd = Random(unpredictableSeed);
        auto n = uniform(0, 100, rnd);
        mylist ~= n;
    }
    mylist.sort();
    sw.stop();
    long msecs = sw.peek.total!"msecs";
    writefln("%s", msecs);
}

void improved()
{
    auto sw = StopWatch(AutoStart.no);
    sw.start();
    int[] mylist = new int[100000];
    auto rnd = Random(unpredictableSeed);
    for (int number = 0; number < 100000; ++number)
    {
        auto n = uniform(0, 100, rnd);
        mylist[number] = n;
    }
    mylist.sort();
    sw.stop();
    long msecs = sw.peek.total!"msecs";
    writefln("%s", msecs);
}

void main()
{
    original();
    improved();
}
-------------


Here's the typical output:
-------------
209
5
-------------

As you can see, that's a 40x improvement in speed. ;-)

Assuming that the ~209 msec on my PC corresponds with your observed
280ms, and assuming that the 40x improvement will also apply on your
machine, the improved version should run in about 9-10 msec.  So this
*should* have give you a 4x speedup over the Python version, in theory.
I'd love to see how it actually measures on your machine, if you don't
mind. ;-)


T

-- 
Holding a grudge is like drinking poison and hoping the other person dies. -- seen on the 'Net


More information about the Digitalmars-d-learn mailing list