AVX for math code ... avx instructions later disappearing ?
james.p.leblanc
james.p.leblanc at gmail.com
Sun Sep 26 18:08:46 UTC 2021
Dear D-ers,
I enjoyed reading some details of incorporating AVX into math code
from Johan Engelen's programming blog post:
http://johanengelen.github.io/ldc/2016/10/11/Math-performance-LDC.html
Basically, one can use the ldc compiler to insert avx code, nice!
In playing with some variants of his example code, I realize that
there are issues I do not understand. For example, the following code
successfully incorporates the AVX instructions:
```d
// File here is called dotFirst.d
import ldc.attributes : fastmath;

@fastmath
double dot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
        s += a[i] * b[i];
    }
    return s;
}

double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

void main()
{
    double z = 0.0;
    z = dot(x, y);
}
```
If we run:
ldc2 -c -output-s -O3 -release dotFirst.d -mcpu=haswell
echo "Results of grep ymm dotFirst.s:"
grep ymm dotFirst.s
The "grep" shows a number of vector instructions, such as:
**vfmadd132pd 160(%rcx,%rdi,8), %ymm5, %ymm1**
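As a sanity check on the numbers, independent of the assembly listing, a minimal sketch like the following can confirm the value the dot product should produce. The name `dotRef` is hypothetical, and the function deliberately omits `@fastmath`, so it is a plain scalar reference rather than the vectorised version.

```d
import std.math : isClose;
import std.stdio : writeln;

// Hypothetical scalar reference (no LDC-specific attributes), so it also
// builds with dmd or gdc.
double dotRef(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length, "slices must have equal length");
    double s = 0.0;
    foreach (i; 0 .. a.length)
        s += a[i] * b[i];
    return s;
}

void main()
{
    double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
    double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

    immutable z = dotRef(x[], y[]);

    // 1.1^2 * (0 + 1 + 4 + 9 + 16 + 25 + 36 + 49) = 1.21 * 140 = 169.4
    assert(isClose(z, 169.4));
    writeln(z);
}
```

Comparing this value against the `@fastmath` build is one way to see how much the reassociated summation changes the result, if at all.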
However, subtle changes to the code (such as moving the dot product
function to a module, or even moving the array declarations before the
dot product function) make the AVX instructions disappear!
```d
import ldc.attributes : fastmath;

@fastmath
double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

double dot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
...
```
Now a grep will not find a single **ymm**.
It is understood that ldc needs proper alignment to be able to use the
vector instructions...
**But my question is:** how is proper alignment guaranteed? (Most
importantly, how is it guaranteed across code that uses modules?)
(There are related stack alignment issues -- 16?)
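To make the alignment question concrete, here is a minimal sketch of how one might inspect the array addresses at run time. The helper `isAligned` is hypothetical (it is not part of LDC or Phobos), and the `align(32)` on the module-level arrays is my assumption about what a 256-bit aligned load would want. I have not confirmed that it has any effect on whether the `ymm` instructions appear.

```d
import std.stdio : writeln;

// Hypothetical helper: true if p sits on the given power-of-two byte boundary.
bool isAligned(const(void)* p, size_t boundary)
{
    return (cast(size_t) p & (boundary - 1)) == 0;
}

// Assumption: request 32-byte alignment for the module-level arrays.
align(32) double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
align(32) double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

void main()
{
    // Report what the compiler and linker actually produced.
    writeln("x 32-byte aligned: ", isAligned(x.ptr, 32));
    writeln("y 32-byte aligned: ", isAligned(y.ptr, 32));
}
```

Note that a check like this only describes the two arrays themselves. A function taking `double[]` slices cannot assume anything about the alignment of whatever a caller passes in.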
Best Regards,
James
PS I have come across scattered bits of (sometimes contradictory)
information on AVX/SIMD for dlang. Is there a canonical source for
vector info?