Faster Virtual Method Dispatch

Craig Black cblack at ara.com
Mon Apr 24 07:35:01 PDT 2006


"Dan" <Dan_member at pathlink.com> wrote in message 
news:e2im9o$29gi$1 at digitaldaemon.com...
>
> Actually, having "if statements" is absolutely TERRIBLE for trying to 
> improve
> branch prediction.  By it's very nature, if statements are branches, so 
> you are
> causing at least one branch misprediction anyways.
>
> Furthermore, pointers are not O(1) algorithms.  What the computer is doing 
> is
> performing a:
>
> LEA (reg) [(v_pointer)];
> J(cc)|JMP [(reg)];

Maybe you are just way over my head but I'm not quite following how this is 
anything bug O(1).

> My understanding is that the v_pointer address is known at compile time, 
> however
> it implies:
>
> that the v_table's cache has to be loaded every time a function is called
> that extra lea buggers instruction pairing tagging on several cycles
> the potential that the v_table is a cache miss (highly unlikely a page 
> miss)
>
> Interestingly however, if one were to prepend some Self Modifying Code to 
> the
> program that instead loaded the vtable into a set of addresses attached to 
> the
> vtable, then one could achieve the benefits of static functions and most 
> of the
> benefits of virtual functions at the cost of having an extra void*.sizeof 
> for
> every function call, and the SMC code overhead.]

OK, now you are definitely over my head. :)

> The problem is that *some* functions would thereafter be virtual so that a
> single pointer could be used to point to any of a set of functions - a 
> case not
> easily rendered with static functions (but still possible remembering that 
> all
> code is just data)
>
> That said, I think the overhead involved in using virtual functions is 
> quite
> reasonable and comparable to the overhead in using 'calling conventions' 
> versus
> a internally-optimized system with primarily register arguments.  It's 7-8
> clocks every function call with a little bit of cache overhead and the 
> potential
> for a cache miss.
>
> Ultimately though, I think the GC causes the highest level of performance
> troubles on the benchmarks.  I love it to death though and wish someone 
> would
> get their hands dirty making it smaller and sexier.

Or (truly) optional. :)  The notion that the GC is optional in D is really 
misleading.  Everything uses it.  While you could certainly make it faster 
at the expense of more complexity, I don't think that GC that scans the 
stack testing for non-null pointers will ever provide the performance of C's 
malloc and free.  GC is simply performing much more work.  I have heard many 
arguments to the contrary, but I still don't see how it is possible that GC 
could compete with malloc and free.

-Craig 





More information about the Digitalmars-d mailing list