Compiler optimizations

Fri Dec 19 17:06:11 PST 2008

On Fri, 19 Dec 2008 03:48:44 -0800, Bill Baxter <wbaxter at gmail.com> wrote:

> On Fri, Dec 19, 2008 at 8:16 PM, Sergey Gromov <snake.scaly at gmail.com>  
> wrote:
>> Fri, 19 Dec 2008 03:56:37 +0000 (UTC), dsimcha wrote:
>>
>>> Does anyone know of a decent guide that has information on what types
>>> of optimizations compilers typically perform and what they aren't
>>> capable of performing automatically?  I figure it would be useful to
>>> know something like this because, when micro-optimizing
>>> performance-critical code, it's silly to do a low-level optimization
>>> that makes the code less readable if the compiler's already probably
>>> doing it.  On the other hand, it would be nice to know exactly what
>>> optimizations (besides the obvious stuff like high-level design and
>>> algorithm optimizations) the compiler can't possibly be sufficiently
>>> smart to do, so I can spend time looking for opportunities to do
>>> those.
>>
>> I'm afraid that your only option is to look at the assembly output and
>> then trick the compiler into making it better.  Very subtle changes
>> make a difference sometimes, and I doubt you can pick up this knowledge
>> from any manual.
>
> Sure but there are some things that aren't so subtle that the
> optimizer probably gets right all the time, or some that it gets right
> none of the time.
>
> For instance if I need to divide more than one number by the same
> thing I usually do
>    auto invx = 1.0/x;
>    first *= invx;
>    second *= invx;
>
> It would be nice to know that I was wasting my time and could just
> write  first/=x; second/=x  and it would be optimized to one
> divide.

I haven't seen a compiler do that optimization yet, and DMD 1.038
doesn't:
foreach(ref val; array)  val /= x;
is ten times slower than
foreach(ref val; array)  val *= invx;
on a 100,000 element array.
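
For anyone who wants to reproduce this, here is a minimal sketch of the two loops in current D. The function names are made up for illustration; only the loop bodies come from the thread. One plausible reason compilers keep the division is that val / x and val * (1.0 / x) are not guaranteed to round to the same result, so the rewrite is generally only legal under relaxed floating-point settings. The third function measures how far apart the two results are.

```d
// Hypothetical helper names; only the loop bodies come from the thread.

// One division per element.
void scaleByDivide(double[] array, double x)
{
    foreach (ref val; array)
        val /= x;
}

// One division in total, then one multiplication per element.
// The result may differ from scaleByDivide in the last bit or so.
void scaleByReciprocal(double[] array, double x)
{
    immutable invx = 1.0 / x;
    foreach (ref val; array)
        val *= invx;
}

// Largest absolute difference between two equally long arrays.
double maxDifference(const double[] a, const double[] b)
{
    assert(a.length == b.length);
    double worst = 0;
    foreach (i; 0 .. a.length)
    {
        immutable d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Running both functions on copies of the same array and printing maxDifference shows whether the reciprocal version is accurate enough for the data at hand.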

> Another thing I do a lot is worry about whether to introduce a new
> variable just to make the naming of a variable more accurate.  Like
> doing this:
>
>     float sum = 0; foreach(x; nums) sum+=x;  float avg = sum / nums.length;
>
> Instead of the slightly less explicit, but potentially less stack using:
>
>     float avg = 0; foreach(x; nums) avg+=x;  avg/=nums.length;

> I bet the compiler is smart enough to eliminate 'avg' if I don't use
> 'sum' afterwards, but I'm not sure.

Most compilers do this with single variables, but if avg were a struct then
the latter is better than the former on at least one compiler I've used
(nvcc, an Open64 derivative), and the difference matters on GPUs at least.
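
To check what a particular compiler does, the two variants can be put side by side as separate functions and their assembly output compared. This is only a sketch: the function names are invented here, and the bodies are the two snippets quoted above.

```d
// Hypothetical function names; the bodies are the two quoted snippets.

// Variant with a separately named accumulator.
// Returns NaN for an empty array (0 / 0).
float averageNamed(const float[] nums)
{
    float sum = 0;
    foreach (x; nums)
        sum += x;
    float avg = sum / nums.length;
    return avg;
}

// Variant that reuses a single variable for the sum and the result.
float averageReused(const float[] nums)
{
    float avg = 0;
    foreach (x; nums)
        avg += x;
    avg /= nums.length;
    return avg;
}
```

If the generated code for the two functions is identical, the extra name costs nothing on that compiler with those flags.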

> Anyway I don't think it makes
> enough difference to my program's speed to be worth investigating.
> But still it would be nice to know whether or not *in general* that
> little micro-optimization is useless or not.
>
> --bb

My general rule of thumb is that the optimizer is pretty good within a
single code block and less so across blocks. And there's really no
substitute for profiling and testing a hand optimization to see if it's
actually worthwhile.
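
As a rough illustration of the "test it" step, the sketch below times two candidate loops with benchmark from Phobos' std.datetime.stopwatch. The function name and parameters are assumptions chosen for this example, and the numbers it produces depend heavily on compiler, flags and hardware.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;

// Hypothetical helper: runs each candidate `runs` times on an array of
// `length` elements and returns the total time taken by each.
Duration[2] timeDivideVsMultiply(size_t length, double x, uint runs)
{
    auto data = new double[](length);

    // Candidate 1: divide every element by x.
    void byDivide()
    {
        data[] = 1.0; // reset the input so values do not underflow
        foreach (ref val; data)
            val /= x;
    }

    // Candidate 2: multiply every element by a precomputed reciprocal.
    void byReciprocal()
    {
        data[] = 1.0;
        immutable invx = 1.0 / x;
        foreach (ref val; data)
            val *= invx;
    }

    return benchmark!(byDivide, byReciprocal)(runs);
}
```

Something like timeDivideVsMultiply(100_000, 3.0, 1_000) returns two Durations, the first for the division loop and the second for the reciprocal loop. Both timings include the cost of resetting the array, so the ratio understates the difference between the loops themselves.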