seeding the pot for 2.0 features [small vectors]

Frits van Bommel fvbommel at REMwOVExCAPSs.nl
Mon Jan 29 07:58:51 PST 2007


Mikola Lysenko wrote:
> Joel C. Salomon wrote:
>> As I understand it, D’s inline assembler would be the tool to use for 
>> this in a library implementation.  I don’t think the complex types use 
>> SIMD, so the vectors can be the only things using those registers.
>>
>>
> 
> I can tell you right now that this won't work.  I have tried using the 
> inline assembler with a vector class and the speedup was barely 
> noticeable.  You can see the results here:  http://assertfalse.com
> 
> Here are just a few of the things that become a problem for a library 
> implementation:
> 
> 1. Function calls
> 
>     Inline assembler cannot be inlined.  Period.  The compiler has to 
> think of inline assembler as a sort of black box, which takes inputs one 
> way and returns them another way.  It cannot poke around in there and 
> change your hand-tuned opcodes in order to pass arguments 
> more efficiently.  Nor can it change the way you allocate registers so 
> you don't accidentally trash the local frame.  It can't manipulate where 
> you put the result, such that it can be used immediately by the next 
> block of code.  Therefore any asm vector class will have a lot of 
> wasteful function calls which quickly add up:
> 
> 
> a = b + c * d;
> 
> becomes:
> 
> a = b.opAdd(c.opMul(d));
> 
> 
> 2. Register allocation
> 
>     This point is related to 1.  Most SIMD architectures have many 
> registers, and a good compiler can easily use that to optimize stuff 
> like parameter passing and function returns.  This is totally impossible 
> for a library to do, since it has no knowledge of the contents of any 
> registers as it executes.

Can GCC-like extended assembler (recently implemented in GDC: 
http://dgcc.sourceforge.net/gdc/manual.html) help with these first two 
points?
It allows you to let the compiler allocate registers. That should fix 
point two.
You can also tell the compiler where to put variables and where you're 
going to put any results. That means your asm doesn't necessarily need 
to access memory to do anything useful. If the compiler sees that it 
doesn't, inlining the function should be possible, I think.
It won't fix all asm, of course, but it might make it possible to write 
inlinable asm.
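
To make that concrete, here is a rough sketch in GCC's own C form of 
extended asm. I have not checked how closely GDC's D syntax matches it, 
so take it as an illustration of the idea only. The names mul_ps, add_ps 
and vec4_madd are made up for this example, and the asm path assumes an 
x86 target with SSE (other targets fall back to a scalar loop). It 
computes the a = b + c * d case from point 1:

```c
#include <string.h>

#if defined(__GNUC__) && defined(__SSE__)
/* Four packed floats, using GCC's vector extension. */
typedef float v4sf __attribute__((vector_size(16)));

/* "+x": a is read and written, in whatever SSE register the compiler
   picks.  "x": b is read from any SSE register.  No register is named
   and no memory is touched inside the asm statement. */
static inline v4sf mul_ps(v4sf a, v4sf b)
{
    __asm__("mulps %1, %0" : "+x"(a) : "x"(b));
    return a;
}

static inline v4sf add_ps(v4sf a, v4sf b)
{
    __asm__("addps %1, %0" : "+x"(a) : "x"(b));
    return a;
}

/* a = b + c * d, on four floats at a time. */
void vec4_madd(float a[4], const float b[4],
               const float c[4], const float d[4])
{
    v4sf vb, vc, vd;
    memcpy(&vb, b, sizeof vb);
    memcpy(&vc, c, sizeof vc);
    memcpy(&vd, d, sizeof vd);
    vb = add_ps(vb, mul_ps(vc, vd));
    memcpy(a, &vb, sizeof vb);
}
#else
/* Fallback for targets without SSE: plain scalar code, same result. */
void vec4_madd(float a[4], const float b[4],
               const float c[4], const float d[4])
{
    for (int i = 0; i < 4; i++)
        a[i] = b[i] + c[i] * d[i];
}
#endif
```

Because the operands are described by constraints instead of fixed 
register names, the compiler is free to inline mul_ps and add_ps and to 
keep the intermediate product in a register between the two 
instructions. Whether it actually does so depends on the compiler and 
the optimization level.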

I do think it needs different syntax. Strings? AT&T asm syntax? Bah. But 
the idea itself is a good one, I think.


