dmd optimizer now converted to D!

Wed Jul 4 17:22:22 UTC 2018

On Thu, Jul 05, 2018 at 04:55:09AM +1200, rikki cattermole via Digitalmars-d wrote:
> On 05/07/2018 4:06 AM, jmh530 wrote:
> > On Tuesday, 3 July 2018 at 23:05:00 UTC, rikki cattermole wrote:
> > > 
> > > On that note, I have a little experiment that I'd like to see
> > > done.  How would the codegen change, if you were to triple the
> > > time the optimizer had to run?
> > 
> > Would it make any difference to compile DMD with LDC?
> 
> We already know the answer to this, and the answer is yes. Dmd does
> run faster. But that isn't what I'm interested in.
> 
> What I want to know is if dmd will produce better code if you give the
> optimizer longer time to run. Because right now that is the limiting
> factor.
[...]

Actually, what will make dmd produce better code IMO is: (1) a more
aggressive metric for the inliner (currently it gives up too easily, at
the slightest increase in code complexity), and (2) an implementation of
loop unrolling.

Both are pretty big factors because of the domino effect in
optimization: inlining a function opens up opportunities for refactoring
wrt the surrounding code, which may yield simplified code that can be
further optimized.  Similarly, (possibly speculative) loop unrolling may
produce simplified code wrt the surrounding context, thus revealing more
loop optimization opportunities. In turn, these opportunities may lead
to more optimization opportunities.
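
To make the inlining half of that concrete, here is a hand-worked sketch
in C.  It is not actual dmd (or gdc/ldc) output; the function names are
made up for illustration, and each version just shows by hand what an
optimizer could derive from the previous one once the call is inlined.

```c
#include <stddef.h>

/* Hypothetical helper: tiny, but it contains branches, which is the
   kind of added complexity a conservative inliner might balk at. */
static int scale(int x, int factor)
{
    if (factor == 0)
        return 0;
    if (factor == 1)
        return x;
    return x * factor;
}

/* As written: one call per element.  Without inlining, the loop body
   cannot take advantage of the fact that factor is always 1. */
int sum_with_call(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += scale(a[i], 1);
    return total;
}

/* Step 1 (by hand): the body of scale() pasted in with factor = 1.
   Every condition is now a compile-time constant. */
int sum_inlined(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        int r;
        if (1 == 0)
            r = 0;
        else if (1 == 1)
            r = a[i];
        else
            r = a[i] * 1;
        total += r;
    }
    return total;
}

/* Step 2 (by hand): constant folding and dead-branch removal leave a
   plain summation loop, which is itself a better candidate for
   further loop optimizations. */
int sum_folded(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

All three functions compute the same result; the point is only that
step 2 is unreachable if step 1 never happens.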

Giving up too early on either front means you miss the first step in
this chain of successive optimizations, so you lose the whole chain.
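
For unrolling, the chain might look like the following hand-worked C
sketch.  Again, this is an illustration under my own assumptions (a
made-up constant table and a fixed trip count of 4), not output from any
of the compilers mentioned.

```c
/* Hypothetical constant table, known at compile time. */
static const int weights[4] = {1, 0, 2, 0};

/* As written: while the loop stays rolled, weights[i] is an indexed
   load that depends on the loop variable. */
int weighted_rolled(const int *v)
{
    int total = 0;
    for (int i = 0; i < 4; i++)
        total += weights[i] * v[i];
    return total;
}

/* Step 1 (by hand): fully unrolled.  Every index into weights[] is
   now a constant. */
int weighted_unrolled(const int *v)
{
    return weights[0] * v[0]
         + weights[1] * v[1]
         + weights[2] * v[2]
         + weights[3] * v[3];
}

/* Step 2 (by hand): the constant table entries fold in, the terms
   multiplied by zero disappear, and two of the four loads go away. */
int weighted_simplified(const int *v)
{
    return v[0] + 2 * v[2];
}
```

If the unrolling step is never attempted, the simplification in step 2
is not available to any later pass.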

I came to this conclusion after looking at disassembly comparisons
between dmd and gdc/ldc over several of my projects.  At first I thought
that the dmd optimizer didn't implement loop optimizations, but that
turns out to be false; dmd *is* capable of things like strength
reduction and code lifting, but as Walter himself has said, it does
*not* implement loop unrolling. Comparing with gdc's output, for
example, it's pretty clear to me that the lack of unrolling causes
further optimization opportunities to be missed.  Ditto with inlining --
gdc's inliner, for example, is far more aggressive and inlines a lot
more things, whereas dmd's inliner gives up earlier.  While for simple
code this may actually be better, for more complex code (and most
importantly, for range-based code), it causes missed optimization
opportunities down the road.
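
For reference, this is roughly what strength reduction and lifting
loop-invariant code out of a loop mean, written out by hand in C.  It is
a sketch with made-up function names, not dmd's actual transformation;
unsigned arithmetic is used so that both versions agree even on
wraparound.

```c
#include <stddef.h>

/* As written: stride * scale is recomputed on every iteration, and
   each element costs a multiplication by the loop index. */
void fill_naive(unsigned *out, size_t n, unsigned stride, unsigned scale)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (unsigned)i * (stride * scale);
}

/* After the two transformations, done by hand. */
void fill_optimized(unsigned *out, size_t n, unsigned stride, unsigned scale)
{
    /* Lifted out of the loop: the product does not depend on i. */
    const unsigned step = stride * scale;

    /* Strength reduction: the per-iteration multiply by i becomes a
       running sum that grows by step each time around. */
    unsigned value = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = value;
        value += step;
    }
}
```

Both functions fill the array with the same values; the second does it
with one multiplication in total instead of two per element.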

If we can nail down these two things, I think dmd's codegen quality
should improve significantly.

T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi