A little Py Vs C++
Kapps
opantm2+spam at gmail.com
Fri Nov 2 14:32:29 PDT 2012
On Friday, 2 November 2012 at 14:22:34 UTC, Jens Mueller wrote:
> But the compiler knows about the alignment, doesn't it?
>
> align(16) float[4] a;
> vs
> float[4] a;
>
> In the former case the compiler can generate better code and it
> should.
> The above syntax is not supported. But my point is all the
> compiler cares about is the alignment which can be specified in
> the code somehow.
> Sorry for being stubborn.
>
> Jens

Note: My knowledge of SIMD/SSE is fairly limited, and may be
somewhat out of date. In other words, some of this may be
flat-out wrong.

First, just because something can have SIMD operations performed
on it doesn't mean you necessarily want them. SSE instructions,
for example, operate on values held in the XMM registers, and
accessing the individual elements of a vector there is expensive.
When using SSE, you want to avoid accessing individual elements
as much as possible; not doing so tends to hurt performance quite
badly. Yet when you just have a float[4], you may access
individual elements frequently or rarely. The compiler can't know
whether you use it more often as a single SIMD vector or simply
as storage for 4 elements. You could be aligning it for any
reason, so alignment alone is not a fair way of deciding.
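
To make the distinction concrete, here is a rough C++ sketch (not D)
of the same 16-byte aligned storage being used in both ways. Float4,
add_as_vector and sum_of_second_and_fourth are names made up for this
example, and the scalar fallback is only there so the sketch still
builds on targets without SSE. The alignment is identical in both
functions; only the programmer's explicit choice says which one is
the "vector" use.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE: __m128, _mm_load_ps, _mm_add_ps, _mm_store_ps
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Hypothetical wrapper: four floats with 16-byte alignment.
struct alignas(16) Float4 {
    float v[4];
};

// Whole-vector use: two aligned loads, one addps, one aligned store.
inline Float4 add_as_vector(const Float4& a, const Float4& b) {
    Float4 out;
#if SKETCH_HAVE_SSE
    // _mm_load_ps requires 16-byte aligned addresses.
    assert(reinterpret_cast<std::uintptr_t>(a.v) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(b.v) % 16 == 0);
    __m128 va = _mm_load_ps(a.v);
    __m128 vb = _mm_load_ps(b.v);
    _mm_store_ps(out.v, _mm_add_ps(va, vb));
#else
    // Assumption: no SSE available, so fall back to plain scalar adds.
    for (int i = 0; i < 4; ++i) out.v[i] = a.v[i] + b.v[i];
#endif
    return out;
}

// Element-wise use of the very same storage: plain scalar reads,
// nothing here benefits from keeping the data in an XMM register.
inline float sum_of_second_and_fourth(const Float4& a) {
    return a.v[1] + a.v[3];
}
```

Calling add_as_vector on {1, 2, 3, 4} and {10, 20, 30, 40} gives
{11, 22, 33, 44}. Whether the vector form is actually faster depends
on how often the surrounding code then reads single elements back.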

Secondly, you can't really know at compile time which SIMD
instructions are supported by your target CPU. It's safe to say
SSE2 is supported by pretty much all x86 CPUs at this point, but
something like SSE4.2 may not be. Just because the compiler knows
that the CPU compiling the program supports an instruction set
doesn't mean that the CPU running the program will have those
instructions.
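
One common way around this is to ask the CPU at run time and pick a
code path accordingly. The C++ sketch below assumes GCC or Clang on
x86, where the __builtin_cpu_init and __builtin_cpu_supports builtins
exist; on anything else it simply reports "not supported". The
function names and the returned strings are invented for the
example.

```cpp
#include <cassert>
#include <string>

// True only if the CPU executing this process reports SSE4.2.
inline bool running_cpu_has_sse42() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // harmless if the runtime already did this
    return __builtin_cpu_supports("sse4.2") != 0;
#else
    // Assumption: unknown compiler or non-x86 target, stay on the baseline.
    return false;
#endif
}

// Hypothetical dispatcher: names the code path a program would take.
inline std::string choose_code_path() {
    return running_cpu_has_sse42() ? "sse4.2" : "baseline";
}
```

The answer comes from the machine running the binary, not the one
that compiled it, so the same executable can take different paths on
different CPUs.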

Lastly, we'd still need SIMD intrinsics. It may be simple to tell
that a float[4] + float[4] operation could use addps, but it
would be more difficult to determine when to use something like
dpps (dot product across two SIMD vectors) and various other
instructions. Not to mention non-x86 architectures.
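
As an illustration of why an intrinsic helps here, this is roughly
what an explicit dot product looks like in C++ with the SSE4.1
intrinsic _mm_dp_ps, next to the scalar loop a compiler would have to
recognise on its own. The sketch assumes GCC or Clang on x86 (for the
target attribute and __builtin_cpu_supports) and falls back to the
scalar version everywhere else. dot4, dot4_dpps and dot4_scalar are
made-up names.

```cpp
#include <cassert>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>  // SSE4.1: _mm_dp_ps
#define SKETCH_X86_GNU 1
#else
#define SKETCH_X86_GNU 0
#endif

// Scalar reference: four multiplies and three adds.
inline float dot4_scalar(const float* a, const float* b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

#if SKETCH_X86_GNU
// Compiled for SSE4.1 even if the rest of the file is not.
__attribute__((target("sse4.1")))
inline float dot4_dpps(const float* a, const float* b) {
    __m128 va = _mm_loadu_ps(a);  // unaligned loads: no alignment assumed
    __m128 vb = _mm_loadu_ps(b);
    // Mask 0xF1: multiply all four lanes (high nibble 0xF) and
    // write the sum into lane 0 only (low nibble 0x1).
    return _mm_cvtss_f32(_mm_dp_ps(va, vb, 0xF1));
}
#endif

// Uses dpps only when the running CPU reports SSE4.1.
inline float dot4(const float* a, const float* b) {
#if SKETCH_X86_GNU
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1")) return dot4_dpps(a, b);
#endif
    return dot4_scalar(a, b);
}
```

For {1, 2, 3, 4} and {5, 6, 7, 8} both paths return 70. A different
architecture would need a different intrinsic altogether, which is
the portability problem mentioned above.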