SIMD
Wyverex
wyverex.cypher at gmail.com
Fri Aug 15 08:03:14 PDT 2008
Don wrote:
> Wyverex wrote:
>> Was messing around with SIMD, SSE stuff.. didn't know how much faster
>> it could be! It's been a few years since I did any assembly.
>> Thought I'd just share this. Any word on adding this to the lib, or on
>> compiler optimizations for this?
>>
>>
>> my results:
>> Parallel Single
>>
>> SQRTPS:0.000120 FSQRT:0.001021
>> SQRTPS:0.000114 FSQRT:0.001026
>> SQRTPS:0.000114 FSQRT:0.001021
>> SQRTPS:0.000114 FSQRT:0.001026
>>
>>
>> codepad if you wish to play with it..
>> http://codepad.org/oqq5jsbJ
>>
>> ...times from codepad
>> SQRTPS:0.000291 FSQRT:0.000634
>> SQRTPS:0.000289 FSQRT:0.000632
>
>> asm
>> {
>> mov EAX, [pa];
>> mov EBX, [pb];
>> mov ECX, times; //error on a.length
>>
>> REP2:
>> fldpi float ptr[EAX];
>
> shouldn't that be fld ?
> fldpi loads 3.1415.... !
> Doesn't make any difference to the time, though.
>
> This is exactly why D just got array operations.
I had fld first, but sqrt of 5 or higher came back as -nan....
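
For anyone following along, the array operations Don mentions look roughly like the sketch below. This is only an illustration, not code from the codepad paste: the helper name scaleAndOffset and the sample values are made up, and it sticks to plain arithmetic rather than square roots. Whether the compiler actually turns this into SSE instructions depends on the implementation, so treat any speedup as something to measure.

```d
import std.stdio : writeln;

// Hypothetical helper (not from the original post): scales and offsets
// every element with a D array operation instead of a hand-written asm loop.
float[] scaleAndOffset(const float[] a, float scale, float offset)
{
    auto result = new float[a.length];
    // Element-wise expression over the whole slice. The implementation is
    // free to vectorize this, but that is not guaranteed.
    result[] = a[] * scale + offset;
    return result;
}

void main()
{
    // Sample data chosen for illustration only.
    float[] a = [1.0f, 2.0f, 3.0f, 4.0f];
    float[] b = [10.0f, 20.0f, 30.0f, 40.0f];

    // Element-wise addition; the destination must already have the right length.
    auto sum = new float[a.length];
    sum[] = a[] + b[];

    writeln(sum);
    writeln(scaleAndOffset(a, 2.0f, 1.0f));
}
```

The destination slice has to be allocated up front, since an array operation writes into existing storage rather than creating a new array.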