tooling quality and some random rant

Walter Bright newshound2 at digitalmars.com
Tue Feb 15 10:58:26 PST 2011


Don wrote:
> Walter Bright wrote:
>> Don wrote:
>>> In hand-coded asm, instruction scheduling still gives more than half 
>>> of the same benefit that it used to do. But, it's become ten times 
>>> more difficult. You have to use Agner Fog's manuals, not Intel/AMD.
>>>
>>> For example:
>>> (1) a common bottleneck on all Intel processors is that you can only 
>>> read from three registers per cycle, but you can also read from any 
>>> register which has been modified in the last three cycles.
>>> (2) it's important to break dependency chains.
>>>
>>> On the BigInt code, instruction scheduling gave a speedup of ~40%.
>>
>> Wow. I didn't know that. Do any compilers currently schedule this stuff?
> 
> Intel probably does. I don't think any others do a very good job. Agner 
> told me that he had had no success in getting compiler vendors to be 
> interested in his work.

Well, this one is. In fact, could we get Agner to actively help us out with this?


>> Any chance you want to take a look at cgsched.c? I had great success 
>> using the same algorithm for the quite different Pentium and P6 
>> scheduling minutiae.
> 
> That would really be fun.
> BTW, the current Intel processors are basically the same as Pentium Pro, 
> with a few improvements. The strange thing is, because of all of the 
> reordering that happens, swapping the order of two (non-dependent) 
> instructions makes no difference at all. So you always need to look at 
> every instruction in the loop before you can do any scheduling.

I was looking at Agner's document, and it looks like ordering the instructions 
in the 4-1-1 or 4-1-1-1 pattern for optimal decoding could work. This would fit 
right in with the way the scheduler works.
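
To make the 4-1-1 idea concrete, here is a rough sketch in C. It is not taken 
from cgsched.c. It assumes a simplified decoder model in which decoder 0 accepts 
any instruction of 1 to 4 uops and each remaining decoder accepts only 
single-uop instructions. It also assumes the instructions are mutually 
independent, which a real scheduler cannot assume. The names decode_cycles and 
order_for_decode are hypothetical.

```c
#include <stddef.h>

/* Count decode cycles for a sequence, given each instruction's uop count.
   Assumed model: decoder 0 takes the next instruction (1-4 uops); each of
   the `simple` other decoders takes one more only if it is a 1-uop
   instruction. A multi-uop instruction must wait for decoder 0. */
static size_t decode_cycles(const int *uops, size_t n, int simple)
{
    size_t cycles = 0, i = 0;
    while (i < n) {
        int used = 0;
        i++;                                   /* decoder 0 */
        while (i < n && used < simple && uops[i] == 1) {
            i++;                               /* a simple decoder */
            used++;
        }
        cycles++;
    }
    return cycles;
}

/* Reorder into groups of one multi-uop instruction followed by up to
   `simple` single-uop instructions (simple = 2 gives 4-1-1, simple = 3
   gives 4-1-1-1). Requires simple >= 1. Dependencies are ignored here,
   which is an assumption made only to keep the sketch short. */
static void order_for_decode(const int *in, int *out, size_t n, int simple)
{
    size_t c = 0, s = 0, o = 0;  /* next multi-uop, next 1-uop, output */
    while (o < n) {
        while (c < n && in[c] == 1) c++;       /* find next multi-uop */
        if (c < n) out[o++] = in[c++];
        for (int k = 0; k < simple; k++) {
            while (s < n && in[s] != 1) s++;   /* find next 1-uop */
            if (s < n) out[o++] = in[s++];
        }
    }
}
```

Under this model, uop counts {1,1,2,1,1,2} reorder to {2,1,1,2,1,1}, so the 
multi-uop instructions always land on decoder 0 instead of stalling behind a 
single-uop instruction.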

I had thought that, with the CPU automatically reordering instructions, 
scheduling them was obsolete.
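
One thing the reordering hardware cannot do is shorten a true dependency chain, 
which is Don's point (2) above. A minimal illustration in C, with hypothetical 
function names: both functions compute the same sum, but the second keeps two 
independent accumulators so that consecutive adds do not wait on each other. 
How much this helps depends on the latency of the operation, and an optimizing 
compiler may transform either loop on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add depends on the result of the previous add. */
static uint64_t sum_one_chain(const uint32_t *a, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Two accumulators: two independent chains the CPU can overlap. */
static uint64_t sum_two_chains(const uint32_t *a, size_t n)
{
    uint64_t s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += a[i];        /* chain 0 */
        s1 += a[i + 1];    /* chain 1 */
    }
    if (i < n)             /* odd element left over */
        s0 += a[i];
    return s0 + s1;
}
```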

