Need for speed
H. S. Teoh
hsteoh at quickfur.ath.cx
Thu Apr 1 19:38:39 UTC 2021
On Thu, Apr 01, 2021 at 04:52:17PM +0000, Nestor via Digitalmars-d-learn wrote:
[...]
> ```
> import std.stdio;
> import std.random;
> import std.datetime.stopwatch : benchmark, StopWatch, AutoStart;
> import std.algorithm;
>
> void main()
> {
> auto sw = StopWatch(AutoStart.no);
> sw.start();
> int[] mylist;
Since the length of the array is already known beforehand, you could get
significant speedups by preallocating the array:
int[] mylist = new int[100000];
for (int number ...)
{
...
mylist[number] = n;
}
> for (int number = 0; number < 100000; ++number)
> {
> auto rnd = Random(unpredictableSeed);
[...]
Don't reseed the RNG every loop iteration. (1) It's very inefficient and
slow, and (2) it actually makes it *less* random than if you seeded it
only once at the start of the program. Move this outside the loop, and
you should see some gains.
> auto n = uniform(0, 100, rnd);
> mylist ~= n;
> }
> mylist.sort();
> sw.stop();
> long msecs = sw.peek.total!"msecs";
> writefln("%s", msecs);
> }
[...]
> ```
Also, whenever performance matters, use gdc or ldc2 instead of dmd. Try
`ldc2 -O2`, for example.
I did a quick test with LDC, with a side-by-side comparison of your
original version and my improved version:
-------------
import std.stdio;
import std.random;
import std.datetime.stopwatch : benchmark, StopWatch, AutoStart;
import std.algorithm;
void original()
{
auto sw = StopWatch(AutoStart.no);
sw.start();
int[] mylist;
for (int number = 0; number < 100000; ++number)
{
auto rnd = Random(unpredictableSeed);
auto n = uniform(0, 100, rnd);
mylist ~= n;
}
mylist.sort();
sw.stop();
long msecs = sw.peek.total!"msecs";
writefln("%s", msecs);
}
void improved()
{
auto sw = StopWatch(AutoStart.no);
sw.start();
int[] mylist = new int[100000];
auto rnd = Random(unpredictableSeed);
for (int number = 0; number < 100000; ++number)
{
auto n = uniform(0, 100, rnd);
mylist[number] = n;
}
mylist.sort();
sw.stop();
long msecs = sw.peek.total!"msecs";
writefln("%s", msecs);
}
void main()
{
original();
improved();
}
-------------
Here's the typical output:
-------------
209
5
-------------
As you can see, that's a 40x improvement in speed. ;-)
Assuming that the ~209 msec on my PC corresponds with your observed
280ms, and assuming that the 40x improvement will also apply on your
machine, the improved version should run in about 9-10 msec. So this
*should* have give you a 4x speedup over the Python version, in theory.
I'd love to see how it actually measures on your machine, if you don't
mind. ;-)
T
--
Holding a grudge is like drinking poison and hoping the other person dies. -- seen on the 'Net
More information about the Digitalmars-d-learn
mailing list