performance issues with SIMD function
Bogdan
contact at szabobogdan.com
Sat Nov 4 11:06:02 UTC 2023
On Friday, 3 November 2023 at 15:32:08 UTC, Sergey wrote:
> On Friday, 3 November 2023 at 15:11:31 UTC, Bogdan wrote:
>> Hi everyone,
>>
>> I was playing around with the intel-intrinsics library, trying
>> to improve the speed of a simple area function. I could not
>> see any performance improvements from the non-SIMD
>> implementation. The SIMD version is a little bit slower even
>> with LDC2 and -O3. Can anyone help me to understand what I am
>> missing?
>>
>> Thanks!
>> Bogdan
>
> Your SIMD algorithm does not gain much from using SIMD.
> The length of the loop is the same.
> Also, the compiler is probably applying some optimizations to the
> regular version that do almost the same thing.
I think it was from the way I was running the benchmark:
```d
// Time 100_000 calls of the non-SIMD version.
auto begin = Clock.currTime;
foreach (i; 0 .. 100_000) {
    res1 = areaMeters(polygon);
}
writeln("No SIMD ", Clock.currTime - begin);

// Time 100_000 calls of the SIMD version.
begin = Clock.currTime;
foreach (i; 0 .. 100_000) {
    res2 = areaMetersSimd2(polygon);
}
writeln("SIMD ", Clock.currTime - begin);
```
gives me:
```
No SIMD 1 sec, 80 ms, 765 μs, and 1 hnsec
SIMD 1 sec, 120 ms, 765 μs, and 1 hnsec
```
while timing a single call of each function:
```d
// Time a single call of the non-SIMD version.
auto begin = Clock.currTime;
res1 = areaMeters(polygon);
writeln("No SIMD ", Clock.currTime - begin);

// Time a single call of the SIMD version.
begin = Clock.currTime;
res2 = areaMetersSimd2(polygon);
writeln("SIMD ", Clock.currTime - begin);
```
gives me:
```
No SIMD 19 μs and 3 hnsecs
SIMD 16 μs and 8 hnsecs
```
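For what it's worth, `Clock.currTime` reads the wall clock, and a single call may mostly measure cold-cache effects that the first function pays for on behalf of the second. A minimal sketch of a steadier setup, assuming Phobos' `std.datetime.stopwatch.StopWatch` (monotonic clock), a warm-up pass, and a result that is actually used so the optimizer cannot drop the loop. `sumOfSquares` is a hypothetical stand-in, not one of the area functions above:

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical stand-in for the function under test.
double sumOfSquares(const double[] values) {
    double total = 0;
    foreach (v; values) total += v * v;
    return total;
}

void main() {
    // Illustrative input: 0, 1, 2, ..., 1023.
    auto data = new double[](1024);
    foreach (i, ref v; data) v = i;

    // Warm-up pass so the timed loop does not pay for cold caches.
    double sink = 0;
    foreach (_; 0 .. 1_000) sink += sumOfSquares(data);

    // Timed loop on the monotonic clock.
    enum runs = 100_000;
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. runs) sink += sumOfSquares(data);
    sw.stop();

    writeln("total: ", sw.peek, ", per call: ", sw.peek / runs);
    // Printing the accumulated result keeps the calls observable.
    writeln("checksum: ", sink);
}
```

Running the same harness once per implementation, and swapping the order between runs, should show whether the difference is larger than the run-to-run noise.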