Test for array literal arguments?

Peter Alexander peter.alexander.au at gmail.com
Wed Jun 6 08:54:35 PDT 2012


On Wednesday, 6 June 2012 at 01:44:24 UTC, bearophile wrote:
> Writing specialized versions without any language help is not 
> nice, and I think the gain is significant, it's not just tiny 
> optimizations. My D programs contain lot of stuff known at 
> compile-time. I think such simple poor's man hand-made version 
> of partial compilation is able to do things like (done by true 
> partial compilation):
>
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.5469&rep=rep1&type=pdf

Thanks for the link. The paper is interesting.

One interesting point to note is that they bloat the executable 
massively by specialising (30x larger in a few cases). 
Fortunately for them, their program is small enough to fit in the 
I-cache anyway, so it has no effect on performance. However, in a 
large program that size blowup would cause many more I-cache 
capacity misses, which would then negatively affect runtime.

This effect has already been seen in the games industry with 
std::sort. As you probably know, std::sort specialises the sort 
routine for the types being sorted, so you end up with N 
specialised sort routines for N different types and each one adds 
a few KB to the executable.

If I want to sort a small array (maybe just 8 integers or so) 
then using std::sort is often worse than qsort because std::sort 
takes up more of the I-cache. It is now common in games to use 
qsort (or a small type-safe equivalent) instead of std::sort for 
small sorts to reduce bloat and I-cache pollution.
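
To make the difference concrete, here is a minimal sketch of the two approaches for an int array. The names compare_ints, small_sort_ints and templated_sort_ints are hypothetical and are not taken from the DICE slides or from any library. Only std::qsort and std::sort are real standard library calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Comparator for qsort: a single ordinary function, called through a
// function pointer from the one shared qsort routine in the C runtime.
static int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow that x - y could cause
}

// Hypothetical type-safe wrapper. The wrapper is tiny; the sort body
// itself is the shared qsort code, so no per-type sort routine is added.
inline void small_sort_ints(int* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    std::qsort(data, count, sizeof(int), compare_ints);
}

// For comparison: std::sort is a template, so each distinct element type
// (and comparator type) it is used with gets its own instantiation.
inline void templated_sort_ints(int* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    std::sort(data, data + count);
}
```

Both functions produce the same sorted array. The trade-off is that qsort pays for an indirect call per comparison, while std::sort pays in code size for every type it is instantiated with. Which one wins for a given small sort is something to measure rather than assume.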

See slides 14-15 of this DICE presentation: 
http://www.slideshare.net/DICEStudio/executable-bloat-how-it-happens-and-how-we-can-ght-it

I'm sure your D programs do contain a lot of stuff known at 
compile-time, but I'm also sure that in any non-trivial program, 
95% of your code is not performance sensitive and would be better 
off small than fast. The sample ray-tracer is not 
representative of a real program.

I hold my position that this would be counterproductive 95% of 
the time. In the 5% that is highly performance sensitive, we can 
use metaprogramming techniques (in D, or using external codegen) 
to make a workable solution.
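
As a rough illustration of what opting in by hand could look like, here is a sketch in C++ templates (the D equivalent would use template value parameters). The function names power and power_fixed are made up for this example. The general version keeps a single copy in the executable, and the specialised version is only generated for the constants you explicitly ask for.

```cpp
// General version: the exponent is only known at run time, so there is
// exactly one copy of this function in the executable.
inline long long power(long long base, unsigned exp) {
    long long result = 1;
    for (unsigned i = 0; i < exp; ++i) {
        result *= base;
    }
    return result;
}

// Opt-in specialised version: the exponent is a compile-time template
// parameter, so the loop trip count is a constant the optimiser may
// unroll. Every distinct Exp used adds another instantiation to the
// binary, which is the code-size cost discussed above.
template <unsigned Exp>
inline long long power_fixed(long long base) {
    long long result = 1;
    for (unsigned i = 0; i < Exp; ++i) {
        result *= base;
    }
    return result;
}
```

Here power_fixed<4>(x) would be used only in the hot path where the exponent really is a constant, and power(x, n) everywhere else, so the bloat is paid only where the programmer has decided it is worth it.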


More information about the Digitalmars-d mailing list