[Optimization] Speculatively not calling invariant on class objects
Iain Buclaw via Digitalmars-d
digitalmars-d at puremagic.com
Wed Aug 12 05:22:37 PDT 2015
This post got me thinking:
http://forum.dlang.org/post/mpo71n$22ma$1@digitalmars.com
We know at compile time, for a given object's static type, whether
or not there are any invariants. That, together with the lack of
any polymorphism in invariants and the fact that invariants are
disallowed in interfaces, means that for the given:
class NoInvariants { }
NoInvariants obj;
assert(obj);
It's only a case of checking each base class for any invariant
functions; if none are found, then we can make an (almost)
reasonable assumption that calling _d_invariant will result in
nothing but wasted cycles.
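For context, a minimal sketch (my own illustration, not part of the
patch; class names are hypothetical) of the two cases: a class with no
invariant anywhere in its hierarchy, and one that declares an invariant
which `assert(obj)` exercises in non-release builds.

```d
// Hypothetical illustration: no invariant anywhere in the hierarchy
// vs. a class that declares one.
class NoInvariants { }

class Checked
{
    int x = 1;
    // Runs on assert(obj) and around public member function calls.
    invariant
    {
        assert(x > 0);
    }
}

void main()
{
    auto a = new NoInvariants();
    assert(a); // nothing to check: the _d_invariant call is wasted cycles
    auto b = new Checked();
    assert(b); // actually exercises the invariant
}
```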
However, these calls can't be omitted completely at compile time,
because we can't guarantee that the object's dynamic type isn't a
derived class that does have an invariant.
But we should be able to speculatively test at runtime whether or
not a call to _d_invariant may be required by doing a simple
pointer test on the classinfo.
So, given a scenario where we *know* that in a given method
'func', the `this` class object NoInvariants provably has no
invariants anywhere in its vtable, we can turn calls to
_d_invariant into:
void func(NoInvariants this)
{
    if (typeid(this) == typeid(NoInvariants))
    {
        /* Nothing */
    }
    else
    {
        _d_invariant(this);
    }
}
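To make the guard's purpose concrete, here is a hedged sketch (the
class names are my own, not from the patch) of the case the runtime
test protects against: the static type has no invariant, but the
dynamic type does.

```d
class NoInvariants { }

// A derived class may add an invariant the base knows nothing about.
class Derived : NoInvariants
{
    int count;
    invariant
    {
        assert(count >= 0);
    }
}

void main()
{
    NoInvariants obj = new Derived();
    // typeid on a class reference yields the *dynamic* type's classinfo,
    // so the cheap pointer comparison correctly falls back to _d_invariant.
    assert(typeid(obj) !is typeid(NoInvariants));
    assert(typeid(obj) is typeid(Derived));
}
```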
A similar tactic is used in C++ for speculative
devirtualization. [1]
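The C++ trick can be sketched in D terms as well: a guessed type test
guards a direct call, with normal virtual dispatch as the fallback.
The classes and the guessed target here are illustrative assumptions,
not the GCC implementation.

```d
class Base
{
    int f() { return 1; }
}

class Child : Base
{
    override int f() { return 2; }
}

// Speculative devirtualization: if the dynamic type matches the guess,
// the call target is known (and could be inlined); otherwise dispatch
// through the vtable as usual.
int callF(Base b)
{
    if (typeid(b) is typeid(Base))
        return 1;      // stands in for an inlined Base.f
    return b.f();      // virtual call fallback
}

void main()
{
    assert(callF(new Base()) == 1);
    assert(callF(new Child()) == 2);
}
```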
Giving this a try on some very contrived benchmarks:

import std.conv : to;
import std.datetime;
import std.stdio : writeln;

void test()
{
    NoInv obj = new NoInv();
    obj.func();
}

auto bench = benchmark!(test)(10_000_000);
writeln("Total time: ", to!Duration(bench[0]));
I found that the patched codegen actually managed to consistently
squeeze out an extra 2% or more in runtime performance over simply
turning off invariants, and in tests where the check was made to
fail, it was pretty much penalty-free in comparison to always
calling _d_invariant.
always_inv(-O2 w/o patch):
- Total time: 592 ms, 430 μs, and 6 hnsecs
always_inv(final, -O2 w/o patch):
- Total time: 572 ms, 495 μs, and 1 hnsec
no_inv(-O2 -fno-invariants):
- Total time: 526 ms, 696 μs, and 3 hnsecs
no_inv(final, -O2 -fno-invariants):
- Total time: 514 ms, 477 μs, and 3 hnsecs
spec_inv(-O2 w/ patch):
- Total time: 513 ms, 90 μs, and 6 hnsecs
spec_inv(final, -O2 w/ patch):
- Total time: 503 ms, 343 μs, and 9 hnsecs
This surprised me; I would have thought that both no_inv and
spec_inv would be the same, but then again maybe I'm just no good
at writing tests (very likely).
I'm raising a PR [2]; granted that no one can see a hole in my
thought process, I'd be looking to get it merged in and let
people try it out to see if they get a similar improvement in
general applications for non-release builds.
Regards
Iain
[1]:
http://hubicka.blogspot.de/2014/02/devirtualization-in-c-part-4-analyzing.html
[2]: https://github.com/D-Programming-GDC/GDC/pull/132