Vector operations optimization.

Comrad comrad.karlovich at googlemail.com
Thu Mar 22 22:57:06 PDT 2012


On Thursday, 22 March 2012 at 10:43:35 UTC, Trass3r wrote:
>> What is the status at the moment? What compiler and with which 
>> compiler flags I should use to achieve maximum performance?
>
> In general gdc or ldc. Not sure how good vectorization is 
> though, esp. auto-vectorization.
> On the other hand the so called vector operations like a[] = 
> b[] + c[]; are lowered to hand-written SSE assembly even in dmd.

I used the following snippet to test the slice (vector) operations:

    import std.stdio;

    void main()
    {
        double[2] a  = [1., 0.];
        double[2] a1 = [1., 0.];
        double[2] a2 = [1., 0.];
        double[2] a3 = [0., 0.];
        // element-wise update written as one array (slice) operation
        foreach (i; 0 .. 1000000000)
            a3[] += a[] + a1[] * a2[];
        writeln(a3);
    }

I compared it with the following D code, which writes out the same
update element by element:

    import std.stdio;

    void main()
    {
        double[2] a  = [1., 0.];
        double[2] a1 = [1., 0.];
        double[2] a2 = [1., 0.];
        double[2] a3 = [0., 0.];
        // same update as the slice version, one element at a time
        foreach (i; 0 .. 1000000000)
        {
            a3[0] += a[0] + a1[0] * a2[0];
            a3[1] += a[1] + a1[1] * a2[1];
        }
        writeln(a3);
    }

And with the following C code:

    #include <stdio.h>

    int main(void)
    {
        double a[2]  = {1., 0.};
        double a1[2] = {1., 0.};
        double a2[2] = {1., 0.};
        /* must start at zero, like a3 in the D versions */
        double a3[2] = {0., 0.};
        unsigned i;
        for (i = 0; i < 1000000000; ++i)
        {
            a3[0] += a[0] + a1[0] * a2[0];
            a3[1] += a[1] + a1[1] * a2[1];
        }
        printf("%f %f\n", a3[0], a3[1]);
        return 0;
    }

I compiled the last one with gcc and the two previous ones with 
dmd and ldc. The C code compiled with -O2 was the fastest, and the 
D code without slicing compiled with ldc was just as fast. The D 
code with slicing was 3 times slower (also with ldc). I tried 
different optimization flags, but that didn't help. Maybe I used 
the wrong ones. Can someone comment on this?
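
For anyone who wants to reproduce the comparison, here is a rough sketch of how the element-wise loop could be timed from inside the program with the standard clock() function. This is not how the numbers above were obtained. The helper names accumulate and time_accumulate are made up for illustration, and the iteration count is left to the caller. One caveat: the inputs are constants, so an optimizing compiler is free to simplify the loop, and the measured time may not reflect the arithmetic itself.

```c
#include <time.h>

/* Hypothetical helper: applies the element-wise update n times. */
void accumulate(double a3[2], const double a[2],
                const double a1[2], const double a2[2],
                unsigned long n)
{
    unsigned long i;
    for (i = 0; i < n; ++i) {
        a3[0] += a[0] + a1[0] * a2[0];
        a3[1] += a[1] + a1[1] * a2[1];
    }
}

/* Hypothetical helper: zeroes a3, runs the loop n times with the same
   inputs as the test programs, and returns the CPU time in seconds. */
double time_accumulate(unsigned long n, double a3[2])
{
    const double a[2]  = {1., 0.};
    const double a1[2] = {1., 0.};
    const double a2[2] = {1., 0.};
    clock_t start, stop;

    a3[0] = 0.;
    a3[1] = 0.;

    start = clock();
    accumulate(a3, a, a1, a2, n);
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_accumulate from main and printing both the returned time and a3 keeps the result observable, which makes it a little harder for the compiler to drop the loop entirely.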


More information about the Digitalmars-d-learn mailing list