DMD 0.174 release

Chris Nicholson-Sauls ibisbasenji at gmail.com
Thu Nov 16 19:39:47 PST 2006


Chris Nicholson-Sauls wrote:
> Chris Nicholson-Sauls wrote:
> 
>> Jarrett Billingsley wrote:
>>
>>> "Reiner Pope" <reiner.pope at REMOVE.THIS.gmail.com> wrote in message 
>>> news:ejg8qd$475$1 at digitaldaemon.com...
>>>
>>>> Jarrett Billingsley wrote:
>>>
>>>> I'm not sure if this is what you want, but here's a cast-free 'call' 
>>>> function:
>>>>
>>>> ...code...
>>>
>>> Ah!  That's an alright tradeoff.  I didn't actually know that you 
>>> could set the .ptr and .funcptr properties of delegates; I thought 
>>> they were read-only!  Cool.
>>>
>>
>> I didn't know this either... and it opens some ideas up to me...  
>> Like, perhaps one could do the following:
>>
>> # class Foo {
>> #   void bar () { ... }
>> # }
>> #
>> # Foo[] list = ... ;
>> # auto dg = &list[0].bar;
>> # dg();
>> # foreach (x; list[1 .. $]) {
>> #   dg.ptr = x;
>> #   dg();
>> # }
>>
>> Might actually be slightly cheaper than a normal call in some cases, 
>> particularly if Foo and/or bar() are final.  Will have to test it...
>>
>> -- Chris Nicholson-Sauls
> 
> 
> Well, I tested it.  And the results are... pretty neutral, really.  
> Which is a plus in its own right, because it does at least mean one can 
> use this trick without worry.  Some exemplar output from the test program:
> 
>                  Repeats:      100
>               Iterations:     1000
>                List size:      500
>             Virtual call:        0 sec /       20 msec /    21417 usec
>     Manipulated delegate:        0 sec /       18 msec /    19367 usec
> 
>                  Repeats:      100
>               Iterations:     5000
>                List size:      500
>             Virtual call:        0 sec /       86 msec /    87139 usec
>     Manipulated delegate:        0 sec /       86 msec /    86640 usec
> 
>                  Repeats:      100
>               Iterations:     5000
>                List size:       25
>             Virtual call:        0 sec /        7 msec /     8117 usec
>     Manipulated delegate:        0 sec /        6 msec /     6901 usec
> 
> So, yes, there is occasionally some speedup from using the manipulated 
> delegate, but it's nothing to scream about.  And it appears that, as the 
> size of the data increases, or the number of iterations over it, the 
> times start to drift toward each other and even out. ("Repeats" in the 
> output is the number of times the test was run, with the results being 
> averaged out.)
> 
> -- Chris Nicholson-Sauls

Well... what do you know.  I get back home and take a look over the test again... and find 
that I actually had made an error.  The data wasn't getting reset between the two loops 
(forgot to call the reset() function... very duh moment)... which means the results for 
the delegate style were actually an average of its runs /plus/ the runs of the normal 
call!  So, I fixed it... and here are a couple of sample runs:

                  Repeats:      100
               Iterations:     1000
                List size:      500
             Virtual call:        0 sec /       21 msec /    21898 usec
     Manipulated delegate:        0 sec /       16 msec /    17367 usec

                  Repeats:      100
               Iterations:     5000
                List size:      500
             Virtual call:        0 sec /       97 msec /    98014 usec
     Manipulated delegate:        0 sec /       93 msec /    94004 usec

                  Repeats:      100
               Iterations:     5000
                List size:       25
             Virtual call:        0 sec /        8 msec /     9436 usec
     Manipulated delegate:        0 sec /        3 msec /     4211 usec

Definitely a more significant difference than I'd previously thought!  Still not a massive 
difference, and the results actually vary quite a bit between runs.  (Sometimes by several 
milliseconds.)  And still, as the data set size or iteration count grows, the times approach 
equality, sometimes within a couple hundred microseconds.  (Every so often, the normal 
calls even perform a little faster.)

I suppose, if this could cleanly and optimally be generalized into a template, it /might/ 
provide some benefit to those wanting to eke out as much speed as possible, albeit with 
inconsistent benefits.

-- Chris Nicholson-Sauls



More information about the Digitalmars-d-announce mailing list