tooling quality and some random rant

Don nospam at nospam.com
Mon Feb 14 17:52:50 PST 2011


Walter Bright wrote:
> retard wrote:
>> There are no arch specific optimizations for PIII, Pentium 4, Pentium D,
>> Core, Core 2, Core i7, Core i7 2600K, and similar kinds of products from
>> AMD.
> 
> The optimal instruction sequences varied dramatically on those earlier 
> processors, but not so much at all on the later ones. Reading the latest 
> Intel/AMD instruction set references doesn't even provide that 
> information anymore.
> 
> In particular, instruction scheduling no longer seems to matter, except 
> for the Intel Atom, which benefits very much from Pentium style 
> instruction scheduling. Ironically, dmc++ is the only available current 
> compiler which supports that.

In hand-coded asm, instruction scheduling still gives more than half of 
the benefit that it used to give, but it has become ten times more 
difficult. You have to use Agner Fog's manuals, not Intel's or AMD's.

For example:
(1) a common bottleneck on all Intel processors is that you can only 
read from three registers per cycle, although you can additionally read 
from any register which has been modified in the last three cycles.
(2) it's important to break dependency chains.
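
To illustrate point (2), here is a minimal C++ sketch rather than 
hand-coded asm, and not the BigInt code. The function names are made up 
for the example. In the first loop every add has to wait for the previous 
one; in the second, four independent accumulators give the CPU four 
shorter chains it can overlap. Integer adds are used so that both 
versions return exactly the same value. A compiler may already do this 
transformation itself, so treat it as a picture of the idea, not a 
benchmark.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example: one accumulator, so each add depends on the
// result of the add before it (a single long dependency chain).
std::uint64_t sum_single_chain(const std::vector<std::uint64_t>& v) {
    std::uint64_t s = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        s += v[i];
    }
    return s;
}

// Hypothetical example: four accumulators, so the four adds in one
// iteration are independent of each other and can execute in parallel.
std::uint64_t sum_four_chains(const std::vector<std::uint64_t>& v) {
    std::uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    // Handle the leftover elements when n is not a multiple of 4.
    for (; i < n; ++i) {
        s0 += v[i];
    }
    // Combine the partial sums only once, at the end.
    return (s0 + s1) + (s2 + s3);
}
```

Both functions can be called on the same vector and will agree; any 
speed difference depends on the processor and on what the compiler does 
with the loop.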

On the BigInt code, instruction scheduling gave a speedup of ~40%.

But still, cache effects are more important than instruction scheduling 
in 99% of cases.
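
A rough C++ sketch of the kind of cache effect meant here (my own 
illustration with made-up function names, assuming a matrix stored 
row-major in one flat array): both functions do exactly the same adds 
and return the same sum, but the first walks memory sequentially while 
the second jumps `cols` elements on every step, which tends to miss the 
cache once the matrix is large. How big the gap is depends entirely on 
the hardware and the matrix size.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example: visit elements in the order they sit in memory.
std::uint64_t sum_row_order(const std::vector<std::uint32_t>& m,
                            std::size_t rows, std::size_t cols) {
    std::uint64_t s = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            s += m[r * cols + c];  // consecutive addresses
        }
    }
    return s;
}

// Hypothetical example: same arithmetic, but each access is `cols`
// elements away from the previous one.
std::uint64_t sum_column_order(const std::vector<std::uint32_t>& m,
                               std::size_t rows, std::size_t cols) {
    std::uint64_t s = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            s += m[r * cols + c];  // strided addresses
        }
    }
    return s;
}
```

No amount of instruction scheduling inside the inner loop changes the 
access pattern; only swapping the loop order does.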

>> No mention of auto-vectorization 
> 
> dmc doesn't do auto-vectorization. I agree that's an issue.

> 
> 
>> or whole program
> 
> I looked into that, there's not a lot of oil in that well.
> 
> 
>> and instruction level optimizations the very latest GCC and LLVM are 
>> now slowly adopting.
> 
> Huh? Every compiler in existence has done, and always has done, 
> instruction level optimizations.
> 
> 
> Note: a lot of modern compilers expend tremendous effort optimizing 
> access to global variables (often screwing up multithreaded code in the 
> process). I've always viewed this as a crock, since modern programming 
> style eschews globals as much as possible.


More information about the Digitalmars-d mailing list