More embarrassing microbenchmarks for D's GC.

Tue Jun 10 14:13:16 PDT 2008

Leandro Lucarella wrote:
> Walter Bright, on June 9 at 17:12 you wrote to me:
>> Leandro Lucarella wrote:
>>> But there are a few other results I can't explain:
>>> 1) Why is the D GC version (disabled or not) ~25% slower than the D version
>>>   that uses malloc when iterating the list? There shouldn't be any GC
>>>   activity in that part. Could it be some GC locality issue that yields
>>>   more cache misses?
>> There are so many factors that can influence this, it's hard to say without 
>> spending a lot of time carefully examining it.
> 
> But there is no GC activity there, right?

Instrument it to be sure.

> 
>>> 2) Why is the D malloc version ~33% slower than the C version? I took a look
>>>   at the generated assembly and it's almost identical:
>>> 	<_Dmain+198>:   lea    -0x20(%ebp),%eax
>>> 	<_Dmain+201>:   lea    0x0(%esi,%eiz,1),%esi
>>> 	<_Dmain+208>:   addl   $0x1,0x8(%eax)
>>> 	<_Dmain+212>:   adcl   $0x0,0xc(%eax)
>>> 	<_Dmain+216>:   mov    (%eax),%eax
>>> 	<_Dmain+218>:   test   %eax,%eax
>>> 	<_Dmain+220>:   jne    0x804a240 <_Dmain+208>
>>> 	<main+248>:     lea    -0x1c(%ebp),%eax
>>> 	<main+251>:     nop
>>> 	<main+252>:     lea    0x0(%esi,%eiz,1),%esi
>>> 	<main+256>:     addl   $0x1,0x4(%eax)
>>> 	<main+260>:     adcl   $0x0,0x8(%eax)
>>> 	<main+264>:     mov    (%eax),%eax
>>> 	<main+266>:     test   %eax,%eax
>>> 	<main+268>:     jne    0x8048550 <main+256>
>>> 	<main+270>:     movl   $0x0,0x4(%esp)
>>> 	<main+278>:     movl   $0x8049800,(%esp)
>>> Tests attached.
>> Without running a profiler, there's no way to be sure about just where in the 
>> code the time is being consumed.
> 
> Attached is the gprof output (for list-test-d-gc with the GC running).
> I don't see any helpful information for point 2), but it clearly shows
> that most of the time is spent in garbage collection.

Break up your code into more functions to get better info from the profiler.
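
As a rough illustration of that advice, here is a minimal C sketch that puts each phase of a list benchmark in its own function. The attached tests are not reproduced here, so everything in it is an assumption: the node layout (a next pointer plus a 64-bit counter, guessed from the addl/adcl pair in the quoted assembly) and the names build_list, iterate_list and free_list are hypothetical. The noinline attribute is a GCC/Clang extension, used so the optimizer does not fold the phases back into the caller and the profiler can attribute time to each one.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node layout (an assumption, not taken from the attached
   tests): a next pointer plus a 64-bit counter. */
struct node {
    struct node *next;
    uint64_t count;
};

/* Keep each phase as a separate symbol so the profiler reports it
   separately. noinline is a GCC/Clang extension. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Cleanup phase: release every node in the list. */
NOINLINE void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Allocation phase: build a list of n nodes with malloc.
   Returns the head, or NULL if n is 0 or an allocation fails. */
NOINLINE struct node *build_list(size_t n)
{
    struct node *head = NULL;
    for (size_t i = 0; i < n; i++) {
        struct node *p = malloc(sizeof *p);
        if (p == NULL) {
            free_list(head);
            return NULL;
        }
        p->count = 0;
        p->next = head;   /* push onto the front of the list */
        head = p;
    }
    return head;
}

/* Traversal phase: bump each node's counter once and return the
   number of nodes visited. */
NOINLINE size_t iterate_list(struct node *head)
{
    size_t visited = 0;
    for (struct node *p = head; p != NULL; p = p->next) {
        p->count++;
        visited++;
    }
    return visited;
}
```

Called in sequence from main and built with gcc -pg, the three phases should then show up as separate rows in the gprof flat profile instead of being lumped into main. The same split applies to the D versions of the test.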