A modest proposal: eliminate template code bloat

Mon Apr 9 03:27:19 PDT 2012

On 04/09/12 08:21, Somedude wrote:
> Le 08/04/2012 16:18, H. S. Teoh a écrit :
>> On Sun, Apr 08, 2012 at 03:01:56PM +0400, Dmitry Olshansky wrote:
>>> I think it's been ages since I meant to ask why nobody (as in
>>> compiler vendors) does what I think is rather simple optimization.
>>>
>>> In the short term the plan is to introduce a "link-time" flavored
>>> optimization at code generation or (better) link step.
>>
>> This would be incompatible with how current (non-dmd) linkers work. But
>> I do like the idea. Perhaps if it works well, other linkers will adopt
>> it? (Just like how the gcc linker adopted duplicate template code
>> elimination due to C++ templates.)
>>
>> T
>>
> 
> Actually, in C++ (as well as D), the added benefit would be a greatly
> improved compilation speed, wouldn't it?
> I bet if the idea works in D and proves to speed up compilation, compiler
> writers would feel compelled to implement it in C++.
> 

They already do.

It's a very simple optimization; the only question is about programmer
expectations. Every (memory) object having a unique address *is* a
valuable feature with clear benefits. (C++ treats functions as
non-objects, which is why the compilers can get away with the
optimization.) Note that this does not actually mean that everything has
to be placed at a unique address -- it only needs to behave *AS IF*, as
long as the program can't tell the difference.
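
To make "the program can't tell the difference" concrete, here is a
minimal C++ sketch. The template `twice` and the helper name are
hypothetical, not from this thread. Both instantiations typically compile
to identical machine code, yet a program can observe whether they share
an address simply by comparing the two pointers:

```cpp
#include <cassert>

// Hypothetical template: for int and unsigned the generated machine
// code is typically byte-for-byte identical.
template <typename T>
T twice(T x) { return x + x; }

// Common pointer type, used only so the two addresses can be compared.
using AnyFn = void (*)();

bool instantiations_have_distinct_addresses() {
    int (*pi)(int) = &twice<int>;
    unsigned (*pu)(unsigned) = &twice<unsigned>;
    // Observable by the program: this would become false if the two
    // bodies were silently folded onto one address.
    return reinterpret_cast<AnyFn>(pi) != reinterpret_cast<AnyFn>(pu);
}
```

Code that never takes or compares these addresses cannot notice a merge,
which is the case where the as-if argument applies.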

On 04/09/12 02:59, Daniel Murphy wrote:
> "Artur Skawina" <art.08.09 at gmail.com> wrote in message 
> news:mailman.1480.1333900846.4860.digitalmars-d at puremagic.com...
>>
>> Note that my point is just that the compiler needs to emit a dummy
>> so that the addresses remain unique, eg
>>
>>   module.f!uint:
>>       jmp module.f!int
> 
> Or use a nop slide before the start of the function.  Since we're modifying 
> the object file format anyway, it would be trivial for the compiler to mark 
> functions which have their address taken as needing a unique address. 

Nice idea. Given the number of alignment no-ops emitted today, it would
usually be completely free.
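
As a rough source-level analogy for the jump stub and the padded entry
point discussed above, the following C++ sketch keeps one shared body and
gives each variant its own tiny entry point. All names are hypothetical,
and a real implementation would do this in the compiler backend or the
linker rather than in source code:

```cpp
#include <cassert>
#include <cstdint>

// Single shared body: stands in for the one copy of machine code that
// remains after identical instantiations have been merged.
static std::uint32_t shared_body(std::uint32_t x) { return x * 3u + 1u; }

// Distinct entry points that only forward to the shared body. At the
// machine level each would ideally be a single jump to shared_body.
std::uint32_t f_uint(std::uint32_t x) { return shared_body(x); }

std::int32_t f_int(std::int32_t x) {
    return static_cast<std::int32_t>(
        shared_body(static_cast<std::uint32_t>(x)));
}

// The addresses stay unique even though the real work is shared.
bool entry_points_are_distinct() {
    using AnyFn = void (*)();
    return reinterpret_cast<AnyFn>(&f_uint) != reinterpret_cast<AnyFn>(&f_int);
}
```

The cost is one extra jump per call through a stub; the padding variant
avoids even that, since execution just falls through into the body.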

But I now think the optimization would be ok, and should even be on by
default, for the case where the identical code sequence was generated
from an identical token sequence. That would handle the template bloat
issue while avoiding most of the problems; having non-unique addresses
for this case should be harmless and would just need to be properly
documented.
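
A minimal sketch of such a restricted folding pass, in C++. The
`Function` record, its `origin` field (standing in for "generated from
this token sequence") and `fold_same_origin` are all hypothetical; no
existing compiler or linker interface is implied:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one emitted function.
struct Function {
    std::string symbol;      // e.g. "module.f!int"
    std::string origin;      // identifies the token sequence it came from
    std::string code_bytes;  // generated machine code
};

// Returns a map from each symbol to the symbol of the copy that is kept.
// A function is folded into an earlier one only if BOTH its origin and
// its code bytes match; identical bytes alone are not enough.
std::map<std::string, std::string>
fold_same_origin(const std::vector<Function>& fns) {
    std::map<std::pair<std::string, std::string>, std::string> canonical;
    std::map<std::string, std::string> kept;
    for (const Function& f : fns) {
        auto key = std::make_pair(f.origin, f.code_bytes);
        // emplace leaves an existing entry alone, so the first function
        // seen for a given key becomes the canonical copy.
        auto it = canonical.emplace(key, f.symbol).first;
        kept[f.symbol] = it->second;
    }
    return kept;
}
```

With this rule, `module.f!uint` folds into `module.f!int` when both come
from the same template body, while an unrelated function that merely
happens to have the same bytes keeps its own copy.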

It's only the replacement of a function by a random, completely unrelated
one that is problematic: think of such functions randomly appearing in
the call chain, confusing both downstream code and programmers looking at
backtraces or perf profiles, and of breakpoints that magically trigger
out of nowhere.

artur