Scope storage class

bearophile bearophileHUGS at lycos.com
Wed Nov 26 15:19:38 PST 2008


Sergey Gromov:
> Remove -inline from your compiler options, and #2 compiles and runs
> faster in both D1 and D2 than #1.
> lazy seems to do something funny when -inline is in effect.

You are right, I have tested it on D1.
I think the codepad doesn't use -inline, that's why #2 works there.
#2 also uses less RAM on D1: for example, N=24 requires about 707 MB instead of 788 MB, and takes about 2.8 s instead of about 3 s.

This is the asm of #2 compiled with D1 using -O -release; it's shorter still (but note there are some other parts that I don't show here):

_D11man_or_boy21aFiLiLiLiLiLiZi	comdat
	assume	CS:_D11man_or_boy21aFiLiLiLiLiLiZi
L0:		push	EAX
		push	EBX
		cmp	dword ptr 034h[ESP],0
		jg	L33
		mov	EAX,014h[ESP]
		mov	EDX,018h[ESP]
		mov	EBX,014h[ESP]
		call	EDX
		push	EAX
		sub	ESP,4
		mov	EAX,014h[ESP]
		mov	EDX,018h[ESP]
		mov	EBX,014h[ESP]
		call	EDX
		mov	ECX,EAX
		add	ESP,4
		pop	EAX
		add	EAX,ECX
		jmp short	L3C
L33:		lea	EAX,4[ESP]
		call	near ptr _D11man_or_boy21aFiLiLiLiLiLiZi1bMFZi
L3C:		pop	EBX
		pop	ECX
		ret	02Ch
_D11man_or_boy21aFiLiLiLiLiLiZi	ends
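
For readers without the earlier messages at hand: the mangled name above demangles to int man_or_boy2.a(int, lazy int, lazy int, lazy int, lazy int, lazy int), with a nested function b. The source of #2 is not quoted in this message, so the following is only a sketch of a lazy-parameter version of Knuth's man-or-boy test with that signature. It is an assumption about the general shape and not necessarily the exact code that was benchmarked; the parameter names k and x1..x5 are illustrative.

```d
import std.stdio : writeln;

// Sketch only: signature taken from the mangled name, body assumed to be
// the usual man-or-boy formulation. Each lazy argument is evaluated on use.
int a(int k, lazy int x1, lazy int x2, lazy int x3, lazy int x4, lazy int x5) {
    // b shares k with the enclosing call of a and passes itself on lazily.
    int b() {
        k--;
        return a(k, b(), x1, x2, x3, x4);
    }
    // Matches the shape of the asm: if k > 0 call b, otherwise add x4 and x5.
    return k <= 0 ? x4 + x5 : b();
}

void main() {
    writeln(a(10, 1, -1, -1, 1, 0)); // prints -67, the reference value for k = 10
}
```

The two indirect calls through EDX in the listing would correspond to evaluating x4 and x5, and the near call to the nested function to b().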

--------------------------

I have then tested #1 and #2 without -inline on D2, and the results are very different from each other: #1 is very slow and uses a lot of memory, while #2 (which contains no scope) behaves like D1, using "only" 707 MB with N=24 and working with N=25 too. The asm code is similar to the one I have just shown here.
So compiling #2 without -inline in D2 fulfills my original desire of computing up to N=25 with D2 :-)

I presume that -inline uncovers a small bug in DMD that will be fixed. But what interests me more now is understanding how to write such fast code in general in D2.

Bye,
bearophile


More information about the Digitalmars-d-announce mailing list