Performance of method calls
Daniel Keep
daniel.keep+lists at
Wed Nov 29 23:35:58 PST 2006
Bill Baxter wrote:
> Daniel Keep wrote:
>> Hi.
>> I'm currently working on a research project as part of my Honours, and
>> one of the requirements is speed--the code I'm writing has to be very
>> efficient.
>> Before I started, my supervisor warned me about object-oriented
>> programming and that it seems to be much slower than just flat
>> function calls.
>> Anyway, I was wondering what the difference between three kinds of
>> function calls would be:
>> 1. Foo x; x.call(); // Where Foo is a struct
>> 2. Foo_call(x); // C-style API
>> 3. auto y = new FooClass; y.call(); // Where call is final
>> I hooked up a test app which used a loop with 100,000 iterations for
>> each call type, and ran that program 100 times, and averaged the outputs.
>> #1 was 2.84 times slower than #2, and #3 was 3.15 times slower than
>> #2. Are those numbers right?? Is it really that much slower? I
>> would have thought that they should have been about the same since
>> each one needs to pass only one thing: a pointer. I've attached the
>> test programs I used; if anyone can offer any insight or even
>> corrections, I'd be very grateful.
>> Incidentally, any other insights regarding performance differences
>> between OO-style and flat procedural-style would be very welcome.
>> -- Daniel
>> ------------------------------------------------------------------------
>> #!/bin/bash
>> for I in {1..100}; do
>> ./struct_calls
>> done | awk '
>> BEGIN {sum1 = 0; sum2 = 0; count = 0;}
>> {sum1 += $5; sum2 += $11; count += 1;
>> print $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11;}
>> END {print "Averages: " sum1/count ", " sum2/count}
>> ' | tee struct_calls.log
>> ------------------------------------------------------------------------
>> module struct_calls;
>> import std.perf;
>> import std.stdio;
>> const COUNT = 100_000u;
>> struct Foo
>> {
>>     uint dummy;
>>     void call()
>>     {
>>         //this.dummy = arg;
>>     }
>> }
>> class FooClass
>> {
>>     uint dummy;
>>     final void call()
>>     {
>>         //this.dummy = arg;
>>     }
>> }
>> void Foo_call(Foo* _this)
>> {
>>     //_this.dummy = arg;
>> }
>> void main()
>> {
>>     Foo x;
>>     scope perf = new HighPerformanceCounter();
>>     perf.start();
>>     for( uint i=0; i<COUNT; i++ )
>>         x.call();
>>     perf.stop();
>>     // count1: x.call()
>>     auto count1 = perf.periodCount();
>>     perf.start();
>>     for( uint i=0; i<COUNT; i++ )
>>         Foo_call(&x);
>>     perf.stop();
>>     // count2: Foo_call(&x)
>>     auto count2 = perf.periodCount();
>>     scope y = new FooClass();
>>     perf.start();
>>     for( uint i=0; i<COUNT; i++ )
>>         y.call();
>>     perf.stop();
>>     // count3: y.call()
>>     auto count3 = perf.periodCount();
>>     writefln("%d / %d = %f -- %d / %d = %f",
>>         count1, count2, cast(real)count1 / cast(real)count2,
>>         count3, count2, cast(real)count3 / cast(real)count2);
>> }
> A call to a final method shouldn't be any slower than straight function
> call. But I have heard grumblings that final in D doesn't actually work
> as spec'ed.
> But anyway the main point of this reply is that even if method call is
> 3x slower than function call, it really means nothing if function call
> overhead is 0.01% of your run time to begin with. There's a popular
> thing to teach in computer architecture classes called "Amdahl's Law",
> and basically to paraphrase it says don't fixate on optimizing things
> that represent trivial fractions of the overall runtime to begin with.
> For example, if method calls represent 0.03% of your overall runtime the
> BEST speedup you could achieve by eliminating all calls entirely would
> be 0.03%. Pretty trivial. It's like trying to figure out how to
> improve driving times from home to work, and focusing on increasing your
> speed in your driveway. Even if you achieve a 100x speedup in how long
> you spend driving on your driveway, you've still only sped up the hour
> long drive by a second or two. Not worth the effort.
> In other words, for all but the most rare of cases, the benefits reaped
> in terms of code readability, maintainability, and speed of development
> when using virtual functions vastly outweigh the tiny cost.
> What you should do is put something representative of the work you
> actually plan to do inside those functions in your test program and see
> if the 3x difference in call speed actually makes any significant
> difference.
> --bb
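For reference, the law Bill paraphrases above can be written out with his own 0.03% figure. This is only a sketch of the arithmetic: the symbols p (fraction of runtime spent in call overhead) and s (factor by which that part is sped up) are introduced here and do not appear in the thread.

```latex
% Amdahl's Law: overall speedup S when a fraction p of the runtime
% is made s times faster.
S(s) = \frac{1}{(1 - p) + \frac{p}{s}}

% Bill's example: call overhead is 0.03% of the runtime, so p = 0.0003.
% Eliminating the calls entirely (s \to \infty):
S_{\max} = \frac{1}{1 - p} = \frac{1}{0.9997} \approx 1.0003

% Making the calls 3x faster (s = 3):
S(3) = \frac{1}{0.9997 + 0.0001} = \frac{1}{0.9998} \approx 1.0002
```

The result depends entirely on p. If call overhead were instead 30% of the runtime (p = 0.3), the same 3x improvement would give S = 1 / (0.7 + 0.1) = 1.25, which is why measuring p on representative code matters before drawing conclusions.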
I'm well aware that the time it takes to call into the function and pass
back out may be a very small amount of the overall time, but when
there's a 2-3x difference between supposedly identical code, that puts
up red warning flags in my head. If this is so much slower, what other
things are slower?
Also, normally I would agree that the benefits of "elegant" code
outweigh the performance drop. But not in this case. I'm not sure if I
can go into specifics, but the system will potentially be simulating
millions of individual agents across a networked cluster, with the
eventual aim to become faster than realtime.
As for putting code representative of what it'll be doing... I can't at
this stage since I'm still building the utility libraries. That
representative code doesn't exist yet.
I don't want to get bogged down in premature optimisation, but I also
want to eliminate any potential bottlenecks early if it isn't too much
In any case, I just want to understand where the difference is coming
from. If I understand the difference, I can hopefully make better
decisions. My supervisor originally wanted the code written in C, and I
want to prove that D is more than up to the task.
-- Daniel
P.S. To give you an idea of how badly I want to use D: I ported CWEB
over to support D :)