tooling quality and some random rant
Walter Bright
newshound2 at digitalmars.com
Tue Feb 15 10:58:26 PST 2011
Don wrote:
> Walter Bright wrote:
>> Don wrote:
>>> In hand-coded asm, instruction scheduling still gives more than half
>>> of the same benefit that it used to do. But, it's become ten times
>>> more difficult. You have to use Agner Fog's manuals, not Intel/AMD.
>>>
>>> For example:
>>> (1) a common bottleneck on all Intel processors is that you can only
>>> read from three registers per cycle, but you can also read from any
>>> register which has been modified in the last three cycles.
>>> (2) it's important to break dependency chains.
>>>
>>> On the BigInt code, instruction scheduling gave a speedup of ~40%.
>>
>> Wow. I didn't know that. Do any compilers currently schedule this stuff?
>
> Intel probably does. I don't think any others do a very good job. Agner
> told me that he had had no success in getting compiler vendors to be
> interested in his work.
Well, this one is. In fact, could we get Agner to actively help us out with this?
>> Any chance you want to take a look at cgsched.c? I had great success
>> using the same algorithm for the quite different Pentium and P6
>> scheduling minutiae.
>
> That would really be fun.
> BTW, the current Intel processors are basically the same as Pentium Pro,
> with a few improvements. The strange thing is, because of all of the
> reordering that happens, swapping the order of two (non-dependent)
> instructions makes no difference at all. So you always need to look at
> every instruction in a loop before you can do any scheduling.
I was looking at Agner's document, and it looks like ordering the instructions
in the 4-1-1 or 4-1-1-1 pattern for optimal decoding could work. This would fit
right in with the way the scheduler works.
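
As a rough illustration of why the 4-1-1 ordering matters, here is a minimal sketch of a simplified decode model. It is an assumption made for this example, not code from cgsched.c or from Agner's manuals: decoder D0 is assumed to accept any instruction of up to 4 uops, the other decoders accept only single-uop instructions, and decoding proceeds strictly in program order. The function name decode_cycles and its parameters are hypothetical.

```c
#include <stddef.h>

/* Simplified model (an assumption for illustration only):
 *  - each cycle, decoder D0 takes the next instruction, which is
 *    assumed to be at most 4 uops;
 *  - each of the `simple` remaining decoders takes the next
 *    instruction only if it is exactly 1 uop;
 *  - decoding is in program order, so a multi-uop instruction that
 *    reaches a simple decoder has to wait for D0 in the next cycle.
 * uops[i] is the uop count of instruction i. */
size_t decode_cycles(const int *uops, size_t n, int simple)
{
    size_t i = 0, cycles = 0;

    while (i < n) {
        cycles++;
        i++;                            /* D0 takes one instruction */
        for (int d = 0; d < simple && i < n && uops[i] == 1; d++)
            i++;                        /* D1.. take 1-uop instructions */
    }
    return cycles;
}
```

Under this model, with two simple decoders (4-1-1), the uop-count sequence {2, 2, 1, 1, 1, 1} needs three decode cycles, while the reordered sequence {2, 1, 1, 2, 1, 1} needs two. Real decoders have further restrictions (instruction length, fetch boundaries), so this only shows the direction of the effect.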
I had thought that, with the CPU automatically reordering instructions,
scheduling them was obsolete.
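
One reason scheduling can still matter is point (2) quoted above: the hardware can reorder independent instructions, but it cannot shorten a chain in which each instruction needs the result of the previous one. The sketch below shows the idea in C rather than asm. It is not the BigInt code; the function names are hypothetical, and splitting into four accumulators is just one common way to break such a chain.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add depends on the previous add,
 * so the loop forms a single long dependency chain. */
uint32_t sum_chained(const uint32_t *a, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four accumulators: four shorter, independent chains that an
 * out-of-order core can overlap. Unsigned arithmetic wraps, so the
 * result is identical to sum_chained for any input. */
uint32_t sum_split(const uint32_t *a, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)                  /* leftover elements */
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```

Whether the split version is actually faster depends on the CPU and on what the compiler already does at the chosen optimization level, so it needs to be measured rather than assumed.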