Thoughts about D

Walter Bright newshound2 at digitalmars.com
Fri Dec 1 22:20:28 UTC 2017


On 12/1/2017 3:31 AM, Nicholas Wilson wrote:
> On Friday, 1 December 2017 at 11:07:32 UTC, Walter Bright wrote:
> Does DMD optimise for locality?

No. However, the much-despised Optlink does! It uses the trace.def output from 
the profiler to set the layout of functions, so that tightly coupled functions 
are co-located.

   https://digitalmars.com/ctg/trace.html
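
As a rough illustration of the idea (not Optlink's actual algorithm, and not the real trace.def format), profile-driven layout can be sketched as a greedy merge: take caller/callee pairs with their call counts, visit the hottest pairs first, and chain their functions together. The function name, the input dictionary, and the merge strategy below are all assumptions made for the example.

```python
# Hypothetical sketch: order functions so heavily coupled ones sit close together.
# The input shape and the greedy strategy are assumptions for illustration only;
# this is neither Optlink's algorithm nor the trace.def file format.

def layout_from_profile(call_counts):
    """call_counts maps (caller, callee) -> number of calls seen in a profile run."""
    # Every function starts out in a chain of its own.
    chain_of = {}
    for caller, callee in call_counts:
        for fn in (caller, callee):
            chain_of.setdefault(fn, [fn])

    # Visit the hottest edges first and merge the two chains they connect.
    hottest_first = sorted(call_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for (caller, callee), _count in hottest_first:
        a, b = chain_of[caller], chain_of[callee]
        if a is b:
            continue  # both functions are already in the same chain
        a.extend(b)
        for fn in b:
            chain_of[fn] = a

    # Emit chains in order of their hottest edge, so cold code lands at the end.
    order, seen = [], set()
    for (caller, _callee), _count in hottest_first:
        chain = chain_of[caller]
        if id(chain) not in seen:
            seen.add(id(chain))
            order.extend(chain)
    return order


profile = {
    ("main", "parse"): 1000,
    ("parse", "lex"): 900,
    ("main", "usage"): 1,
    ("error", "abort"): 2,
}
print(layout_from_profile(profile))
```

With this made-up profile the hot path (main, parse, lex) comes out first and the rarely used error/abort pair is pushed to the end, which is the property that lets cold pages stay on disk.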

It's not even just cache locality - rarely used functions can be grouped onto 
pages of their own so they are never even loaded in from disk. (The executable 
files are demand loaded.) The speed improvement can be dramatic, especially for 
program startup time and for programs that do a lot of swapping. I don't know 
if the Linux linker can accept a script file telling it the function layout.

The downside is that, because it relies on runtime profile information, it is 
awkward to set up and needs a representative usage test case to drive it.

dmd could potentially use a static call graph to do a better-than-nothing stab 
at it, but it would only work on code supplied to it as a group on the command line.
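
A minimal sketch of what such a static approach might look like, assuming the compiler can see the whole call graph for the modules it was given: walk the graph depth-first from the entry points so each callee is placed soon after its caller. This is not something dmd does; the function name and the graph representation are hypothetical.

```python
# Hypothetical sketch: derive a function order from a static call graph alone.
# No profile data is used, so call frequency is unknown; only "who calls whom".

def layout_from_static_graph(calls, roots):
    """calls maps function -> list of callees visible at compile time."""
    order, placed = [], set()

    def place_from(start):
        # Iterative depth-first pre-order walk: a caller is emitted first,
        # then its callees in source order.
        stack = [start]
        while stack:
            fn = stack.pop()
            if fn in placed:
                continue
            placed.add(fn)
            order.append(fn)
            stack.extend(reversed(calls.get(fn, [])))

    for root in roots:
        place_from(root)
    # Functions unreachable from the roots go last.
    for fn in calls:
        place_from(fn)
    return order


graph = {
    "main": ["parse", "report"],
    "parse": ["lex"],
    "lex": [],
    "report": [],
    "unused": ["lex"],
}
print(layout_from_static_graph(graph, ["main"]))
```

Functions in modules the compiler never saw do not appear in the graph at all, which is why the approach only covers code compiled together.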


> I would hope co-located functions are either larger than cache lines by a 
> reasonable amount or, if they are small enough, inlined so that the asserts can 
> be aggregated. It is also possible (though I can't comment on how easy it would 
> be to implement) if you are trying to optimise for co-location to have the 
> asserts be completely out of line so that you have
> 
> function1
> function2
> function3
> call asserts of function1
> call asserts of function2
> call asserts of function3
> 
> such that the calls to the asserts never appear in the icache at all apart from 
> overlap of e.g. function1's asserts after the end of function3, or when one 
> of the asserts fails.

It's possible, although the jmps to the assert code would now have to be 
unconditional relocatable jmps which are larger:

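     jne L1            ; short conditional jump over the next instruction
     jmp assertcode    ; unconditional relocatable jmp to the out-of-line assert code
L1: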
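
To put rough numbers on "larger": in 32- and 64-bit x86 code a short conditional jump (8-bit displacement) encodes in 2 bytes, while an unconditional jmp with a 32-bit displacement encodes in 5 bytes. The sketch below compares the branch bytes left in the hot path per assert, assuming the inline version needs only the short conditional jump; the function name and that assumption are mine.

```python
# x86 encoded sizes in bytes (32/64-bit mode).
JCC_REL8 = 2   # e.g. jne short: opcode 75 + 8-bit displacement
JMP_REL32 = 5  # jmp near: opcode E9 + 32-bit displacement

def branch_bytes(asserts):
    """Return (inline, out_of_line) branch bytes in the hot path."""
    # Assumption: the inline form needs just one short conditional jump
    # around assert code that sits right next to it.
    inline = asserts * JCC_REL8
    # Out-of-line form from the snippet above: jne L1 followed by jmp assertcode.
    out_of_line = asserts * (JCC_REL8 + JMP_REL32)
    return inline, out_of_line


print(branch_bytes(100))
```

The comparison counts branch instructions only; the inline version also keeps the assert failure code itself in the hot path, which is the other side of the tradeoff.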


> Then it becomes a tradeoff, one that I'm glad the compiler is doing for me.

Everything about codegen is a tradeoff :-)



More information about the Digitalmars-d mailing list