primitive vector types

Daniel Keep daniel.keep.lists at gmail.com
Thu Feb 19 18:31:39 PST 2009


Andrei Alexandrescu wrote:
> Denis Koroskin wrote:
>> On Thu, 19 Feb 2009 22:25:04 +0300, Mattias Holm
>> <hannibal.holm at gmail.com> wrote:
>>
>>> Since (SIMD) vectors are so common and every reasonable system supports
>>> them in one way or the other (and scalar emulation of this is rather
>>> simple), why not have support for this in D directly?
>>>
>>> [snip]
>>
>> I don't see any reason why float4 can't be made a library type.
> 
> Yah, I was thinking the same:
> 
> struct float4
> {
>     __align(16) float[4] data; // right syntax and value?
>     alias data this;
> }
> 
> This looks like something that should go into std.matrix pronto. It even
> has value semantics even though fixed arrays don't :o/.
> 
> 
> Andrei

I remember implementing a vector struct [1] quite some time ago that had
an SSE-accelerated path.  There were three problems I had with it:

1. The alignment thing.  Incidentally, I just did a quick check and
don't see any notes in the changelog about __align(n) syntax.  As I
remember, there was no way to actually ensure the data was properly
aligned.  (There's "Data items in static data segment >= 16 bytes in
size are now paragraph aligned." but that doesn't help when the vectors
are on, say, the stack or in the heap.)

2. As soon as you use inline asm, you lose inlining.  When the functions
are as small as they are, this can be a bit of overhead.  It gets worse
when you realise that the CPU is spending most of its time running data
back and forth between main memory and the XMM registers...

   Array operations help, but they don't cover everything.

3. There was a not insignificant performance difference for using byref
passing on operators over byval passing.  Of course, you can't ACTUALLY
use byref because it completely breaks anything that uses a temporary
expression as an argument.
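
To make points 1 and 3 concrete, here is a minimal sketch, not the struct I
actually wrote.  Float4 and isAligned16 are made-up names for illustration,
and the operator uses present-day D2 overloading syntax, which may not match
what your compiler offers.  It shows a by-value operator (so temporaries are
accepted as arguments) and a run-time check for 16-byte alignment, since the
language itself gives no guarantee for stack or heap data:

```d
// Hypothetical 4-float vector; name and layout are illustrative only.
struct Float4
{
    float[4] data;

    // Element-wise add taking the right-hand side by value, so a
    // temporary such as (a + b) can be used as an argument.
    Float4 opBinary(string op : "+")(Float4 rhs) const
    {
        Float4 r;
        r.data[] = data[] + rhs.data[]; // array operation over all four lanes
        return r;
    }
}

// True if the vector's storage starts on a 16-byte boundary.
// Nothing here forces alignment; this only reports it.
bool isAligned16(ref const Float4 v)
{
    return (cast(size_t) v.data.ptr & 15) == 0;
}

unittest
{
    Float4 a, b;
    a.data[] = 1.0f;
    b.data[] = 2.0f;

    // The inner sum is a temporary passed by value to the outer "+".
    auto c = (a + b) + a;
    assert(c.data == [4.0f, 4.0f, 4.0f, 4.0f]);

    // May be true or false depending on where the compiler put "a".
    bool stackAligned = isAligned16(a);
}
```

An SSE path would have to branch on something like isAligned16 (or use
unaligned loads), which is exactly the sort of overhead I was complaining
about.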

In the end, I just dropped it to see how BLADE would turn out.  I ended
up coming to the conclusion that while we can do a float[4] vector in D
and use SIMD to speed it up, there's not much point when BLADE is there.
Of course, BLADE is a little unwieldy to use, what with that mixin
malarkey.  Pity we didn't get AST macros... :P

Anyway, just my AUD$0.02.

  -- Daniel


[1] That struct was scary.  It was one of those Vector!(type, size)
jobbies, so it had multiple paths through functions, members that only
existed for certain sizes, special-cased loop unrolling... don't even
ASK about the matrix struct... :P


