Scope storage class
bearophile
bearophileHUGS at lycos.com
Wed Nov 26 15:19:38 PST 2008
Sergey Gromov:
> Remove -inline from your compiler options, and #2 compiles and runs
> faster in both D1 and D2 than #1.
> lazy seems to do something funny when -inline is in effect.
You are right, I have tested it on D1.
I think the codepad doesn't use -inline, that's why #2 works there.
#2 also uses less RAM on D1: for example, N=24 requires about 707 MB instead of 788 MB, and takes about 2.8 s instead of about 3 s.
This is the asm of #2 compiled with D1 using -O -release; it's shorter still (but note that there are some other parts that I don't show here):
_D11man_or_boy21aFiLiLiLiLiLiZi comdat
assume CS:_D11man_or_boy21aFiLiLiLiLiLiZi
L0: push EAX
push EBX
cmp dword ptr 034h[ESP],0
jg L33
mov EAX,014h[ESP]
mov EDX,018h[ESP]
mov EBX,014h[ESP]
call EDX
push EAX
sub ESP,4
mov EAX,014h[ESP]
mov EDX,018h[ESP]
mov EBX,014h[ESP]
call EDX
mov ECX,EAX
add ESP,4
pop EAX
add EAX,ECX
jmp short L3C
L33: lea EAX,4[ESP]
call near ptr _D11man_or_boy21aFiLiLiLiLiLiZi1bMFZi
L3C: pop EBX
pop ECX
ret 02Ch
_D11man_or_boy21aFiLiLiLiLiLiZi ends
--------------------------
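For readers who don't have the sources at hand: the mangled symbol in the listing above demangles to a function a in module man_or_boy2 with the signature int a(int, lazy int, lazy int, lazy int, lazy int, lazy int), and the called symbol ending in 1bMFZi is a nested function int b(). A minimal sketch consistent with that signature, assuming the usual formulation of Knuth's man-or-boy test, could look like this. The parameter names k and x1..x5 and the main function are assumptions for illustration, and this is not necessarily the exact source of #2:

```d
import std.stdio;

// Sketch of Knuth's man-or-boy test with lazy parameters.
// Parameter names are hypothetical; only the types come from the mangled symbol.
int a(int k, lazy int x1, lazy int x2, lazy int x3, lazy int x4, lazy int x5)
{
    // Nested function: it modifies k in the enclosing frame, and the
    // expression b() is itself passed on as a lazy argument.
    int b()
    {
        k--;
        return a(k, b(), x1, x2, x3, x4);
    }

    // k <= 0: evaluate the two lazy arguments; otherwise recurse through b.
    return k <= 0 ? x4 + x5 : b();
}

void main()
{
    writeln(a(10, 1, -1, -1, 1, 0)); // prints -67
}
```

The shape matches the listing: the cmp/jg pair tests k against 0, the two "call EDX" sequences evaluate the lazy x4 and x5 and add the results, and the other branch calls the nested b.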
I have then tested #1 and #2 without -inline on D2, and the results are very different from each other: #1 is very slow and uses a lot of memory, while #2 (which contains no scope) behaves as on D1, using "only" 707 MB with N=24 and working with N=25 too. The asm code is similar to the one I have just shown here.
So compiling #2 without -inline in D2 fulfills my original desire of computing up to N=25 with D2 :-)
I presume that -inline uncovers a small bug in DMD that will be fixed. But what interests me more now is understanding how to write such fast code in general in D2.
Bye,
bearophile