System programming in D (Was: The God Language)
    a 
    a at a.com
       
    Thu Dec 29 05:13:02 PST 2011
    
    
  
Kapps Wrote:
> Agreed.
> 
> There are plenty of real-world, even 'common' examples where the lack of 
> being able to force inlining for a function is a problem. The main one 
> I've run into is not being able to inline functions with assembly, thus 
> not being able to implement efficient SIMD operations.
The problem is not just inlining, but also the needless loads and stores at the beginning and end of each asm block. For example, the following code:
void test(ref V a, ref V b)
{
    asm
    {
        movaps XMM0, a;
        addps  XMM0, b;
        movaps a, XMM0;
    }
    asm
    {
        movaps XMM0, a;
        addps  XMM0, b;
        movaps a, XMM0;
    }
}
compiles to:
   0:   55                      push   %rbp
   1:   48 8b ec                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f0             mov    %rdi,-0x10(%rbp)
   c:   48 89 75 f8             mov    %rsi,-0x8(%rbp)
  10:   0f 28 45 f8             movaps -0x8(%rbp),%xmm0
  14:   0f 58 45 f0             addps  -0x10(%rbp),%xmm0
  18:   0f 29 45 f8             movaps %xmm0,-0x8(%rbp)
  1c:   0f 28 45 f8             movaps -0x8(%rbp),%xmm0
  20:   0f 58 45 f0             addps  -0x10(%rbp),%xmm0
  24:   0f 29 45 f8             movaps %xmm0,-0x8(%rbp)
  28:   48 8b e5                mov    %rbp,%rsp
  2b:   5d                      pop    %rbp
  2c:   c3                      retq   
The needless loads and stores would make it impossible to write an efficient SIMD add function, even if functions containing asm blocks could be inlined.
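
For comparison, here is a rough sketch of the same two additions written with vector values instead of asm blocks. It assumes a compiler and target that provide the core.simd module and its float4 type, which nothing in the code above relies on. The name addTwice is made up for illustration. With vector values the compiler is, in principle, free to keep the intermediate sum in a register between the two additions, which is exactly what the separate asm blocks prevent.

```d
import core.simd;

// Hypothetical counterpart of test() above, written without inline asm.
// Assumption: float4 (four packed floats) is available on the target.
void addTwice(ref float4 a, ref float4 b)
{
    a = a + b; // first packed add, the role of the first asm block
    a = a + b; // second packed add, the role of the second asm block
}
```

Whether the generated code actually avoids the intermediate store depends on the compiler and optimization flags, so the disassembly should be checked in the same way as above.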
    
    