Faster Virtual Method Dispatch
Craig Black
cblack at ara.com
Mon Apr 24 07:35:01 PDT 2006
"Dan" <Dan_member at pathlink.com> wrote in message
news:e2im9o$29gi$1 at digitaldaemon.com...
>
> Actually, having "if statements" is absolutely TERRIBLE for trying to
> improve
> branch prediction. By it's very nature, if statements are branches, so
> you are
> causing at least one branch misprediction anyways.
>
> Furthermore, pointers are not O(1) algorithms. What the computer is doing
> is
> performing a:
>
> LEA (reg) [(v_pointer)];
> J(cc)|JMP [(reg)];
Maybe you are just way over my head but I'm not quite following how this is
anything bug O(1).
> My understanding is that the v_pointer address is known at compile time,
> however
> it implies:
>
> that the v_table's cache has to be loaded every time a function is called
> that extra lea buggers instruction pairing tagging on several cycles
> the potential that the v_table is a cache miss (highly unlikely a page
> miss)
>
> Interestingly however, if one were to prepend some Self Modifying Code to
> the
> program that instead loaded the vtable into a set of addresses attached to
> the
> vtable, then one could achieve the benefits of static functions and most
> of the
> benefits of virtual functions at the cost of having an extra void*.sizeof
> for
> every function call, and the SMC code overhead.]
OK, now you are definitely over my head. :)
> The problem is that *some* functions would thereafter be virtual so that a
> single pointer could be used to point to any of a set of functions - a
> case not
> easily rendered with static functions (but still possible remembering that
> all
> code is just data)
>
> That said, I think the overhead involved in using virtual functions is
> quite
> reasonable and comparable to the overhead in using 'calling conventions'
> versus
> a internally-optimized system with primarily register arguments. It's 7-8
> clocks every function call with a little bit of cache overhead and the
> potential
> for a cache miss.
>
> Ultimately though, I think the GC causes the highest level of performance
> troubles on the benchmarks. I love it to death though and wish someone
> would
> get their hands dirty making it smaller and sexier.
Or (truly) optional. :) The notion that the GC is optional in D is really
misleading. Everything uses it. While you could certainly make it faster
at the expense of more complexity, I don't think that GC that scans the
stack testing for non-null pointers will ever provide the performance of C's
malloc and free. GC is simply performing much more work. I have heard many
arguments to the contrary, but I still don't see how it is possible that GC
could compete with malloc and free.
-Craig
More information about the Digitalmars-d
mailing list