Performance test of short-circuiting AliasSeq
Stefan Koch
uplink.coder at googlemail.com
Tue Jun 2 19:21:32 UTC 2020
On Tuesday, 2 June 2020 at 09:19:57 UTC, Stefan Koch wrote:
> On Tuesday, 2 June 2020 at 06:52:00 UTC, Walter Bright wrote:
>> On 6/1/2020 1:23 PM, Stefan Koch wrote:
>>> TLDR; Performance patch caused a slowdown.
>>> Why? Because checking for the case which it wants to
>>> optimize takes more time than you save by optimizing it.
>>
>> You'll need a finer grained profile to reach this conclusion
>> than the gross measurement of the compiler runtime. I suggest
>> using Vtune.
>
> Vtune doesn't do intrusive profiling as far as I know.
> Therefore its measurements _can_ miss functions which are
> called often but are very short-running.
> A profile obtained by periodic sampling has the potential
> of misrepresenting the situation.
>
> In this case finer grained analysis is of course needed to make a
> 100% definite statement, however the measurement itself is
> quite telling.
> Your short-circuiting patch was the only thing that changed
> between the versions.
>
> I do have a fully instrumented version of dmd.
> I know exactly which function spends how much time processing
> which symbol.
>
> I will post those results soon.
So I have a little profiling tool in DMD which tells me how often
certain functions are being called.
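To make the idea of intrusive profiling concrete, here is a minimal sketch in Python. It is not the tool inside DMD; the names `instrumented`, `report` and `demo.short_function` are made up for illustration. Every call is counted and timed, so a short function that is called often cannot be missed the way it can with periodic sampling, and avgTime is simply absTime divided by freq.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical registry: phase name -> [absolute time in ns, call count].
_phases = defaultdict(lambda: [0, 0])


def instrumented(name):
    """Wrap a function so that every single call is counted and timed."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            try:
                return fn(*args, **kwargs)
            finally:
                entry = _phases[name]
                entry[0] += time.perf_counter_ns() - start  # absTime
                entry[1] += 1                               # freq
        return wrapper
    return decorate


def report():
    """Return (phase, avgTime, absTime, freq) rows, largest absTime first."""
    rows = [(name, total / freq, total, freq)
            for name, (total, freq) in _phases.items() if freq]
    return sorted(rows, key=lambda row: row[2], reverse=True)


@instrumented("demo.short_function")
def short_function(x):
    # Deliberately tiny: the kind of function a sampling profiler can miss.
    return x + 1


for i in range(1000):
    short_function(i)

for phase, avg, total, freq in report():
    print(f"{phase}  {avg:.2f}  {total}  {freq}")
```

The price of this approach is that the timing code itself runs on every call, which is why such instrumentation is usually limited to a chosen set of functions.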
This is the tool's output for the test with the patch:
=== Phase Time Distribution : ===
phase                                                     avgTime      absTime    freq
dmd.dsymbolsem.dsymbolSemantic                          127270.86  37145399296  291861
dmd.dsymbolsem.templateInstanceSemantic                 355060.25  17450856448   49149
dmd.expressionsem.symbolToExp                              993.92    113991488  114689
dmd.dtemplate.TemplateInstance.syntaxCopy                 1327.78     92444120   69623
dmd.dmangle.Mangler.mangleType                             293.39     60637312  206680
dmd.dtemplate.TemplateDeclaration.findExistingInstance     950.36     19462416   20479
dmd.mtype.TypeIdentifier.syntaxCopy                        203.80     12570752   61681
Type::mangleToBuffer                                        83.38      2107624   25277
dmd.dmangle.Mangler.mangleSymbol                           573.35       108936     190
dmd.func.FuncDeclaration.functionSemantic                  374.00          374       1
=== Phase Time Distribution : ===
phase                                                     avgTime      absTime    freq
dmd.dsymbolsem.dsymbolSemantic                          124654.58  36400631808  292012
dmd.dsymbolsem.templateInstanceSemantic                 348265.00  17116876800   49149
dmd.expressionsem.symbolToExp                             1013.76    116267384  114689
dmd.dtemplate.TemplateInstance.syntaxCopy                 1317.11     91701088   69623
dmd.dmangle.Mangler.mangleType                             288.24     59609648  206809
dmd.dtemplate.TemplateDeclaration.findExistingInstance     954.04     19537828   20479
dmd.mtype.TypeIdentifier.syntaxCopy                        205.71     12690568   61693
Type::mangleToBuffer                                        91.08      2307294   25332
dmd.dmangle.Mangler.mangleSymbol                           832.58       169014     203
dmd.func.FuncDeclaration.functionSemantic                  680.00          680       1
As you can see, templateInstanceSemantic is called the same
number of times in both runs.
That would imply the optimization never actually gets
triggered.
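The comparison can be checked mechanically. The following Python sketch copies the absTime and freq columns of the larger phases from the two tables and prints the difference per phase. The first table is the run with the patch; the second table is not labelled in this post, so it is called `other_run` here. The variable names are illustrative only.

```python
# (absTime, freq) per phase, copied from the two tables in this post.
with_patch = {
    "dmd.dsymbolsem.dsymbolSemantic": (37145399296, 291861),
    "dmd.dsymbolsem.templateInstanceSemantic": (17450856448, 49149),
    "dmd.expressionsem.symbolToExp": (113991488, 114689),
    "dmd.dtemplate.TemplateInstance.syntaxCopy": (92444120, 69623),
    "dmd.dmangle.Mangler.mangleType": (60637312, 206680),
    "dmd.dtemplate.TemplateDeclaration.findExistingInstance": (19462416, 20479),
    "dmd.mtype.TypeIdentifier.syntaxCopy": (12570752, 61681),
}
# Second table; the post does not say explicitly which build produced it.
other_run = {
    "dmd.dsymbolsem.dsymbolSemantic": (36400631808, 292012),
    "dmd.dsymbolsem.templateInstanceSemantic": (17116876800, 49149),
    "dmd.expressionsem.symbolToExp": (116267384, 114689),
    "dmd.dtemplate.TemplateInstance.syntaxCopy": (91701088, 69623),
    "dmd.dmangle.Mangler.mangleType": (59609648, 206809),
    "dmd.dtemplate.TemplateDeclaration.findExistingInstance": (19537828, 20479),
    "dmd.mtype.TypeIdentifier.syntaxCopy": (12690568, 61693),
}

# phase -> (freq difference, absTime difference), computed as other - patched.
delta = {}
for phase, (abs_a, freq_a) in with_patch.items():
    abs_b, freq_b = other_run[phase]
    delta[phase] = (freq_b - freq_a, abs_b - abs_a)

# Phases whose call count is identical in both runs.
unchanged = [phase for phase, (dfreq, _) in delta.items() if dfreq == 0]

for phase, (dfreq, dabs) in delta.items():
    print(f"{phase}: freq {dfreq:+d}, absTime {dabs:+d}")
print("same call count:", ", ".join(unchanged))
```

An identical call count for templateInstanceSemantic is the observation the conclusion above rests on; the absTime differences show where the two runs spent a different amount of time.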