Naive node.js faster than naive LDC2?

H. S. Teoh hsteoh at quickfur.ath.cx
Sat Aug 22 00:10:43 UTC 2020


On Fri, Aug 21, 2020 at 04:49:44PM -0700, H. S. Teoh via Digitalmars-d wrote:
[...]
> Using a class for Complex (and a non-final one at that!!) introduces
> tons of allocation overhead per iteration, plus virtual function call
> overhead.  You should be using a struct instead.  I betcha this one
> change will make a big difference in performance.
[...]

OK, so I copied the code and changed the class to struct, and compared
the results. Both versions are compiled with ldc2 -O3.
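
The benchmark source itself isn't reproduced in this message, so the
following is only a minimal sketch of the kind of change involved.  The
names ComplexClass, ComplexStruct and escapeTime, and the z = z*z + c
loop, are hypothetical stand-ins rather than the original code.  It
shows why the class version pays for two GC allocations per iteration
while the struct version pays for none.

```d
import std.stdio : writeln;

// Hypothetical class version: a reference type, so every `new` below is a
// GC heap allocation, and its methods are virtual because it is not final.
class ComplexClass
{
    double re, im;

    this(double re, double im)
    {
        this.re = re;
        this.im = im;
    }

    ComplexClass add(ComplexClass o)
    {
        return new ComplexClass(re + o.re, im + o.im);
    }

    ComplexClass mul(ComplexClass o)
    {
        return new ComplexClass(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    double norm2() { return re * re + im * im; }
}

// Hypothetical struct version: a value type, so results are returned by
// value with no heap allocation, and method calls are direct.
struct ComplexStruct
{
    double re, im;

    ComplexStruct add(ComplexStruct o) const
    {
        return ComplexStruct(re + o.re, im + o.im);
    }

    ComplexStruct mul(ComplexStruct o) const
    {
        return ComplexStruct(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    double norm2() const { return re * re + im * im; }
}

// The same loop written once against either type.  With ComplexClass each
// iteration allocates two temporaries (one in mul, one in add).
int escapeTime(T)(T z, T c, int maxIter)
{
    foreach (i; 0 .. maxIter)
    {
        z = z.mul(z).add(c);
        if (z.norm2() > 4.0)
            return i;
    }
    return maxIter;
}

void main()
{
    writeln(escapeTime(new ComplexClass(0, 0), new ComplexClass(0.25, 0.5), 1000));
    writeln(escapeTime(ComplexStruct(0, 0), ComplexStruct(0.25, 0.5), 1000));
}
```

The timings that follow are for the actual benchmark code, not for this
sketch.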

	class version:
	7 secs, 125 ms, 608 μs, and 9 hnsecs
	7 secs, 155 ms, 328 μs, and 6 hnsecs
	7 secs, 158 ms, 966 μs, and 4 hnsecs

	struct version:
	6 secs, 55 ms, 140 μs, and 4 hnsecs
	6 secs, 125 ms, 974 μs, and 5 hnsecs
	6 secs, 126 ms, 945 μs, and 4 hnsecs

For performance comparisons, take the best of n (because the others are
merely measuring more system noise).  Best of 3 here is 7.126 s for the
class version vs. 6.055 s for the struct version, i.e. about a 15%
reduction in run time from switching to struct instead of class.
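
For anyone who wants to check the arithmetic, here is a small sketch
that takes the best of the three ldc2 -O3 timings listed above and
computes the relative reduction in run time.  The helpers t and
reductionPercent are names made up for this illustration; Duration, dur
and min are the usual core.time and std.algorithm facilities.  It should
come out at roughly 15%.

```d
import core.time : Duration, dur;
import std.algorithm.comparison : min;
import std.stdio : writefln;

// Hypothetical helper: build a Duration from the fields printed in the
// timings above (secs, ms, μs, hnsecs).
Duration t(long secs, long ms, long us, long hn)
{
    return dur!"seconds"(secs) + dur!"msecs"(ms) + dur!"usecs"(us)
        + dur!"hnsecs"(hn);
}

// Hypothetical helper: percentage by which `after` is shorter than `before`.
double reductionPercent(Duration before, Duration after)
{
    immutable double b = before.total!"hnsecs";
    immutable double a = after.total!"hnsecs";
    return (b - a) / b * 100.0;
}

void main()
{
    // Best of n: the smallest timing is the one least disturbed by noise.
    auto bestClass = min(t(7, 125, 608, 9), t(7, 155, 328, 6), t(7, 158, 966, 4));
    auto bestStruct = min(t(6, 55, 140, 4), t(6, 125, 974, 5), t(6, 126, 945, 4));

    writefln("best class:  %s", bestClass);
    writefln("best struct: %s", bestStruct);
    writefln("run time reduction: %.1f%%", reductionPercent(bestClass, bestStruct));
}
```

The same two helpers work unchanged for any other pair of timings in
this message.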

I thought it might make a difference to optimize for my CPU with
-mcpu=native, so here are the numbers:

	class version:
	7 secs, 100 ms, 602 μs, and 6 hnsecs
	7 secs, 100 ms, 437 μs, and 7 hnsecs
	7 secs, 121 ms, 594 μs, and 4 hnsecs

	struct version:
	6 secs, 73 ms, 534 μs, and 3 hnsecs
	5 secs, 662 ms, 626 μs, and 5 hnsecs
	6 secs, 103 ms, 871 μs, and 2 hnsecs

Again taking the best of 3 (7.100 s vs. 5.663 s), that's about a 20%
reduction in run time from changing class to struct.

//

Just for laughs, I tested with dmd -O -inline:

	class version:
	7 secs, 255 ms, 748 μs, and 5 hnsecs
	7 secs, 249 ms, 683 μs, and 9 hnsecs
	7 secs, 593 ms, 847 μs, and 8 hnsecs

	struct version:
	7 secs, 646 ms, 685 μs, and 5 hnsecs
	7 secs, 618 ms, 642 μs, and 7 hnsecs
	7 secs, 606 ms, 85 μs, and 4 hnsecs

Surprisingly, the class version does *better* than the struct version
when compiled with dmd.  (Wow, is dmd codegen *that* bad that it
outweighs even class allocation overhead?? :-D)  But both are worse than
even the class version with ldc2 -O3 (even without -mcpu=native).

So yeah.  I wouldn't touch dmd with a 10-foot pole when it comes to
runtime performance.  The struct version compiled with `ldc2 -O3
-mcpu=native` (best: 5.663 s) beats the struct version compiled with dmd
(best: 7.606 s) by a 26% margin.  That's pretty sad.


T

-- 
An imaginary friend squared is a real enemy.

