A modest proposal: eliminate template code bloat
Marco Leise
Marco.Leise at gmx.de
Sun Apr 8 05:37:14 PDT 2012
Am Sun, 08 Apr 2012 15:01:56 +0400
schrieb Dmitry Olshansky <dmitry.olsh at gmail.com>:
> I think it's been ages since I meant to ask why nobody (as in compiler
> vendors) does what I think is rather simple optimization.
>
> In the short term the plan is to introduce a "link-time" flavored
> optimization at code generation or (better) link step.
>
> For simplicity let's assume compiler does all of the following during
> generation of an object file.
>
> 1. Every time a function is generated (or pretty much any symbol) not
> only a size calculated but also a checksum* of it's data.
> (If we go for link-time optimization we should find a place to stick it
> to in the object file)
>
> 2. Compiler maintains a hash-table of symbol_size ---> list( ~ array) of
> pairs (references to data, checksum) of all symbols with given size.
> Call it a duplicate table. Every function generated and more generally
> global immutable data should end up there.
>
> 3. After any function was generated compiler checks an entry in the
> duplicate table that matches size, followed by matching checksum and
> only then (if required) doing a straight memcmp. If it happens that
> there is a match compiler just throws generated code away and _aliases_
> it's symbol to that of a matched entry.
> (so there has to be an alias table if there isn't one already)
>
> *Yes, checksum. I think it should be real simple and easy to parallel
> hash function. The original checksum is no reliable but some amount
> balancing and profiling are obviously required when picking this function.
>
> Applicability:
> It's not only const-immutable bloat, it can be alleviated with inout.
> Yet there are plenty of places the exact same code is being generated:
> e.g. sealed containers of int vs uint, std.find on dchar[] vs
> uint[]/int[] an so on.
> In general, the coarse grained parametrization is the root of all evil
> and it is inevitable since we are just humans after all.
>
> Notes:
> 1. If we do checksum calculation on the fly during codegen it gets at
> virtually no cost as the data is in CPU data cache. Preliminary version
> can avoid hacking this part of backend though.
>
> 2. By _alias_ I mean the ability of compiler to emit references to a
> given symbol as if it was some other symbol (should be really straight
> forward).
>
> 3. Linker have more data and is able to achieve colossal size savings,
> essentially running through the same algorithm before actually linking
> things. Again it's easily separable step and can be an opt-in.
>
> 4. Ironically the same exact thing works with any kind of immutable data
> structures. It looks like string pooling is superseded by this proposal.
>
> Thoughts?
Thoughts? Nothing much. I thought of that a while ago, but as an external program, that finds function calls by disassembling and removing dead/duplicate code. So I agree with you. A similar feature is a CTFE cache (or general code cache) that checksums a function's source code and gets the compiled version from a cache.
Template bloat could be especially important to 'fix' on embedded systems. But I don't consider it important enough at the moment. :/ Let's wait till the bugs and important features are implemented or hack the compiler ourselves.
--
Marco
More information about the Digitalmars-d
mailing list