dmd codegen improvements

Wed Aug 19 03:42:37 PDT 2015

On Wednesday, 19 August 2015 at 10:25:14 UTC, ponce wrote:
> Loops in video coding already have no conditional. And for the 
> one who have, conditionals were already removeable with 
> existing instructions.

I think you are side-stepping the issue. Most people don't write 
video codecs. Most people also don't want to hand-optimize their 
inner loops. The typical and most likely scenario is to run some 
easy-to-read-but-suboptimal function over a dataset. You need 
both library and compiler support for that to work out.
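
To make that concrete, here is a minimal sketch in C of the kind 
of easy-to-read loop I have in mind. It is my own illustration, 
and the function name clamp_negative_to_zero is hypothetical. It 
has a conditional in the loop body, so whether it ends up as a 
branch-free SIMD select or as a scalar loop with a branch depends 
on the backend and the flags used.

```c
#include <stddef.h>

/* Hypothetical example: replace every negative sample with zero.
   Written the obvious way, with a conditional inside the loop.
   A vectorizing backend may turn the branch into a masked or
   max-style operation; a simpler backend will keep it scalar. */
void clamp_negative_to_zero(float *data, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (data[i] < 0.0f)
            data[i] = 0.0f;
    }
}
```

Nobody is going to rewrite this by hand with intrinsics for each 
instruction set, which is why the work has to land in the 
compiler or in a library.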

But even then, a 10% difference in CPU benchmarks is a disaster.

> I stand by what I know and measured: previously few things are 
> speed up by AVX-xxx. It almost always better investing this 
> time to optimize somewhere else.

AVX-512 is too far into the future, but if you are going to write 
a backend you have to think about increasing register sizes. A 
larger register size does not mean that throughput increases in 
the generation that introduces it (a wide operation could 
translate into several micro-ops).

But if you start redesigning your backend now, then maybe you 
have something good in 5 years, so you need to plan ahead, 
thinking not about the current generation but 1-3 generations 
ahead.

Keep in mind that clock speeds are unlikely to increase, but 
stacking of memory on top of the CPU and getting improved memory 
bus speeds is quite a likely scenario.

A good use for the DMD backend would be to improve and redesign 
it for compile-time evaluation. Then use LLVM for codegen.