SIMD support...
Peter Alexander
peter.alexander.au at gmail.com
Fri Jan 6 13:56:38 PST 2012
On 6/01/12 4:28 AM, Martin Nowak wrote:
> I also don't think that we can efficiently provide arbitrary alignment
> for stack variables.
> The performance penalty will kill your efforts.
> Gcc doesn't do it either.
Vector registers are usually plentiful, so spilling to stack is rare. An
(additional) performance penalty when this happens is acceptable as the
performance-minded will avoid spillage anyway.
> Providing intrinsics should happen through library support.
That's a contradiction in terms. Intrinsics by definition are not
contained in libraries. The compiler needs to know specifics of their
implementation so that it can do register allocation and instruction
scheduling.
> Either through expression templates or with GPGPU in mind using
> a DSL compiler for string mixins.
>
> auto result = vectorize!q{
> auto v = float4(a, b, c, d);
> auto v2 = float4(2 * a, 2.0, c - d, d + a);
> auto v3 = v * v2;
> auto v4 = __hadd(v3, v3);
> auto v5 = __hadd(v4, v4);
> return v5[0];
> }(0.2, 0.2, 0.3, 0.4);
So you suggest we implement a compiler in compile-time D?
There are several things wrong with this:
1. It wouldn't be able to interact with the host compiler's register
allocation and instruction scheduling. The best it could do is paste
blocks of asm inline into your code. Any interaction with variables or
functions outside that block would incur overhead.
2. Unless you write a whole D compiler, you won't be able to use D
expressions within your vectorization code (e.g. use CTFE to provide
some of those constants).
3. It only works in self-contained code like this. What if I have a
normal D function, but I want one parameter passed in a vector register
and want to intermix normal D code and vector code?
4. Errors within your vectorize code wouldn't be able to provide useful
line numbers.
5. Your vectorize code can't benefit from syntax highlighting.
6. It would no doubt slow compile times.
7. It's probably just as much of an undertaking as adding it to the
actual compiler.
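
To make points 2 and 3 concrete, here is a minimal sketch of what compiler-level support allows. It assumes a compiler-provided vector type; I borrow float4 from core.simd as found in later D releases, which is an assumption and not something settled in this thread, and it assumes an x86-64 target. The names scaleFactor, kScale and scaledSum are hypothetical and exist only for illustration.

```d
import core.simd; // assumption: compiler-provided vector types (x86-64 target)

// Ordinary D function. It runs at compile time (CTFE) when used to
// initialise an enum, and could equally be called at run time.
float scaleFactor(int steps)
{
    float s = 1.0f;
    foreach (i; 0 .. steps)
        s *= 0.5f;
    return s;
}

// Forced CTFE: the constant is computed by the host compiler.
enum float kScale = scaleFactor(3);

// A normal D function with one vector parameter, freely mixing
// scalar code and vector code (hypothetical example).
float scaledSum(float4 v, float bias)
{
    float4 scale = kScale;    // broadcast the CTFE constant to all lanes
    float4 r = v * scale;     // element-wise vector multiply
    float[4] lanes = r.array; // read the lanes back as a static array
    return lanes[0] + lanes[1] + lanes[2] + lanes[3] + bias;
}
```

Because the vector type is an ordinary D type here, the CTFE constant, the scalar parameter and the vector parameter all live in the same function, and any error in it is reported against a real source line. A string-based DSL would have to reimplement each of those pieces itself.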