SIMD benchmark

Don Clugston dac at nospam.com
Tue Jan 17 00:42:12 PST 2012


On 16/01/12 17:51, Martin Nowak wrote:
> On Mon, 16 Jan 2012 17:17:44 +0100, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org> wrote:
>
>> On 1/15/12 12:56 AM, Walter Bright wrote:
>>> I get a 2 to 2.5 speedup with the vector instructions on 64 bit Linux.
>>> Anyhow, it's good enough now to play around with. Consider it alpha
>>> quality. Expect bugs - but make bug reports, as there's a serious lack
>>> of source code to test it with.
>>> -----------------------
>>> import core.simd;
>>>
>>> void test1a(float[4] a) { }
>>>
>>> void test1()
>>> {
>>>     float[4] a = 1.2;
>>>     a[] = a[] * 3 + 7;
>>>     test1a(a);
>>> }
>>>
>>> void test2a(float4 a) { }
>>>
>>> void test2()
>>> {
>>>     float4 a = 1.2;
>>>     a = a * 3 + 7;
>>>     test2a(a);
>>> }
>>
>> These two functions should have the same speed. The function that
>> ought to be slower is:
>>
>> void test1()
>> {
>>     float[5] a = 1.2;
>>     float[] b = a[1 .. $];
>>     b[] = b[] * 3 + 7;
>>     test1a(a);
>> }
>>
>>
>> Andrei
>
> Unfortunately druntime's array ops are a mess and fail
> to speed up anything below 16 floats.
> Additionally, there is overhead for a function call, and
> they have to check alignment at runtime.
>
> martin

Yes. The structural problem in the compiler is that array ops get turned 
into function calls far too early. It happens in the semantic pass, but 
it shouldn't happen in the front-end at all -- it should be done in the 
glue layer, at the beginning of code generation.
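
Roughly speaking, the front end rewrites a[] = a[] * 3 + 7 into a call to a
druntime helper. The sketch below uses a made-up helper name (the real
routines are generated per element type and operation, and their signatures
differ), but it shows where the cost Martin mentions comes from: a function
call plus a runtime alignment check, even for a four-float array.

// Illustration only: hypothetical stand-in for a druntime array-op helper.
void arrayOpMulAddAssign(float[] dest, const float[] src, float mul, float add)
{
    // The call site's slices can start anywhere, so alignment has to be
    // checked at runtime before any SSE path could be taken.
    bool aligned = (((cast(size_t) dest.ptr) | (cast(size_t) src.ptr)) & 15) == 0;

    // The real helpers branch to a vectorised loop when the data is aligned
    // and the slice is long enough; this sketch only shows the scalar fallback.
    foreach (i; 0 .. dest.length)
        dest[i] = src[i] * mul + add;
}

void test1()
{
    float[4] a = 1.2;
    // What the semantic-pass rewrite of a[] = a[] * 3 + 7 amounts to:
    arrayOpMulAddAssign(a[], a[], 3, 7);
}

If the rewrite were done in the glue layer instead, the backend could see
that both operands are the same 16-byte static array and emit the vector
instructions directly, which is what the float4 version gets today.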

Incidentally, this is the reason that CTFE doesn't work with array ops.
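For example, something like this would fail under CTFE, because by the time
the interpreter sees it the array op has already become a runtime call it
cannot execute (a minimal illustration, not taken from the test suite):

int[4] triple()
{
    int[4] a = 1;
    a[] = a[] * 3;    // rewritten into a druntime call during semantic analysis
    return a;
}

// Forcing compile-time evaluation runs into the rewritten call.
enum x = triple();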




