[Optimization] Speculatively not calling invariant on class objects

Iain Buclaw via Digitalmars-d digitalmars-d at puremagic.com
Wed Aug 12 05:22:37 PDT 2015


This post got me thinking: 
http://forum.dlang.org/post/mpo71n$22ma$1@digitalmars.com

We know at compile time, for a given class, whether or not it 
declares any invariants. That, together with the lack of any 
polymorphism in invariants and the fact that invariants are 
disallowed in interfaces, means that for the given:

   class NoInvariants { }
   NoInvariants obj;
   assert(obj);

it's only a case of checking each base class for any invariant 
functions; if none are found, we can make the (almost) 
reasonable assumption that calling _d_invariant will result in 
nothing but wasted cycles.
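
To contrast, here's a quick sketch (class names are mine) of the 
case where that check does find an invariant, either on the 
class itself or on a base, so _d_invariant genuinely has work to 
do:

   class Checked
   {
     int count;
     invariant() { assert(count >= 0); }
   }
   class StillChecked : Checked { }  // inherits the base class invariant

   StillChecked obj = new StillChecked();
   assert(obj);  // runs Checked's invariant via _d_invariant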

Even when none are found, however, the call can't be omitted 
completely at compile time, because we can't guarantee that the 
object hasn't been up-cast from a derived class that does have 
an invariant.
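
To illustrate the problem, here's a sketch with a hypothetical 
derived class; the up-cast hides the invariant from the static 
type:

   class HasInvariant : NoInvariants
   {
     int count;
     invariant() { assert(count >= 0); }
   }

   NoInvariants obj = new HasInvariant();  // up-cast to the base
   assert(obj);  // must still end up calling HasInvariant's invariant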

But we should be able to speculatively test at runtime whether or 
not a call to _d_invariant may be required by doing a simple 
pointer test on the classinfo.

So, given a scenario where we *know* that in a given method 
'func', the 'this' object of static type NoInvariants provably 
has no invariants anywhere in its vtable, we can turn calls to 
_d_invariant into:

   void func(NoInvariants this)
   {
     if (typeid(this) == typeid(NoInvariants))
     {
       /* Nothing */
     }
     else
     {
       _d_invariant(this);
     }
   }

A similar tactic is used in C++ for speculative 
devirtualization. [1]

Giving this a try on some very contrived benchmarks:

   import core.time, std.conv, std.datetime, std.stdio;

   // NoInv assumed: a class with no invariant and an empty member function.
   class NoInv { void func() { } }

   void test()
   {
       NoInv obj = new NoInv();
       obj.func();
   }
   auto bench = benchmark!(test)(10_000_000);
   writeln("Total time: ", to!Duration(bench[0]));


I found that the patched codegen actually managed to consistently 
squeeze out an extra 2% or more in runtime performance over just 
turning off invariants, and in tests where the check was made to 
fail, it was pretty much penalty-free compared to always calling 
_d_invariant.

always_inv(-O2 w/o patch):
- Total time: 592 ms, 430 μs, and 6 hnsecs

always_inv(final, -O2 w/o patch):
- Total time: 572 ms, 495 μs, and 1 hnsec

no_inv(-O2 -fno-invariants):
- Total time: 526 ms, 696 μs, and 3 hnsecs

no_inv(final, -O2 -fno-invariants):
- Total time: 514 ms, 477 μs, and 3 hnsecs

spec_inv(-O2 w/ patch):
- Total time: 513 ms, 90 μs, and 6 hnsecs

spec_inv(final, -O2 w/ patch):
- Total time: 503 ms, 343 μs, and 9 hnsecs

This surprised me; I would have expected no_inv and spec_inv to 
perform the same, but then again maybe I'm just no good at 
writing tests (very likely).

I'm raising a PR [2]. Provided that no one can see a hole in my 
thought process, I'd be looking to get it merged in and let 
people try it out, to see whether they get a similar improvement 
in general applications for non-release builds.


Regards
Iain


[1]: 
http://hubicka.blogspot.de/2014/02/devirtualization-in-c-part-4-analyzing.html
[2]: https://github.com/D-Programming-GDC/GDC/pull/132

