I wonder how fast we'd do

KnightMare black80 at bk.ru
Tue May 28 10:23:40 UTC 2019


On Tuesday, 28 May 2019 at 04:38:32 UTC, Andrei Alexandrescu 
wrote:
> https://jackmott.github.io/programming/2016/07/22/making-obvious-fast.html

I'm all for it. A quick test:
https://pastebin.com/j0T0MRmA (small changes to Marco de Wild's code)
Windows Server 2019, i7-3615QM, DMD 2.086.0, LDC 1.16.0-b1

C:\content\downloadz\dlang>ldc2 -release -O3 -mattr=avx times2.d
C:\content\downloadz\dlang>times2.exe
t1=42 ms, 714 ╬╝s, and 9 hnsecs         r=10922666154674544967680
t2=42 ms and 614 ╬╝s                    r=10922666154674544967680
t3=0 hnsecs     r=0
t4=42 ms, 474 ╬╝s, and 8 hnsecs         r=10922666154674544967680

C:\content\downloadz\dlang>dmd -release -O -mcpu=avx times2.d
C:\content\downloadz\dlang>times2.exe
t1=141 ms, 263 ╬╝s, and 5 hnsecs        r=10922666154673907433000
t2=143 ms, 128 ╬╝s, and 9 hnsecs        r=10922666154673907433000
t3=1 hnsec      r=0
t4=491 ms, 829 ╬╝s, and 9 hnsecs        r=10922666154673907433000

1) DMD and LDC produce different sums (probably fast-math, I don't know).
2) t3=0 for d_with_sum. Let's look at the assembler output from LDC (-output-s):
	.def	 _D6times210d_with_sumFNaNbNfAdZd;
	.scl	2;
	.type	32;
	.endef
	.section	.text,"xr",discard,_D6times210d_with_sumFNaNbNfAdZd
	.globl	_D6times210d_with_sumFNaNbNfAdZd
	.p2align	4, 0x90
_D6times210d_with_sumFNaNbNfAdZd:
	vxorps	%xmm0, %xmm0, %xmm0
	retq // does this mean "return 0"? Cool optimization
3) On Windows it would be better to print "us" instead of "μs", which the 
console shows as "╬╝s" (when linked with /SUBSYSTEM:CONSOLE).
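
For point 3, a minimal sketch of one way to get ASCII-only units without touching the console code page. The helper name asciiDuration is hypothetical and not part of times2.d. It only assumes that the timings are core.time.Duration values, which the "hnsecs" in the output suggests:

```d
import core.time : Duration, dur;
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper (not in times2.d): formats a Duration with
// ASCII unit names only, so no "μ" ever reaches the console.
string asciiDuration(Duration d)
{
    // split puts everything above milliseconds into the msecs field,
    // so a 1.5 s duration shows up as 1500 ms.
    auto parts = d.split!("msecs", "usecs", "hnsecs")();
    return format("%s ms, %s us, %s hnsecs",
                  parts.msecs, parts.usecs, parts.hnsecs);
}

void main()
{
    // Sample value built by hand to mirror the shape of the t1 line.
    auto t = dur!"msecs"(42) + dur!"usecs"(714) + dur!"hnsecs"(9);
    writeln("t1=", asciiDuration(t)); // t1=42 ms, 714 us, 9 hnsecs
}
```

Unlike Duration.toString, this always prints all three fields, even when one of them is zero.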


