primitive vector types

Don nospam at nospam.com
Mon Feb 23 00:18:33 PST 2009


Mattias Holm wrote:
> On 2009-02-21 17:03:06 +0100, Don <nospam at nospam.com> said:
>>
>> I don't think that's messy at all. I can't see much difference between 
>> special support for float[4] versus float4. It's better if the code 
>> can take advantage of hardware without specific support. Bear in mind 
>> that SSE/SSE2 is a temporary situation. AVX provides for much longer 
>> arrays of vectors; and it's extensible. You'd end up needing to keep 
>> adding on special types whenever a new CPU comes out.
>>
>> Note that the fundamental concept which is missing from the C virtual 
>> machine is that all modern machines can efficiently perform operations 
>> on arrays of built-in types of length 2^n, for some small value of n.
>> We need to get this into the language abstraction. Not follow C++ in 
>> hacking a few extra special types onto the old, deficient C model. And 
>> I think D is actually in a position to do this.
>>
>> float[4] would be a greatly superior option if it could be done.
>> The key requirements are:
>> (1) need to specify that static arrays are passed by value.
>> (2) need to keep stack aligned to 16.
>> The good news is that both of these appear to be done on DMD2-Mac!
> 
> Yes, float[4] would be ok, if some CPU independent permutation support 
> can be added. Would this be with some intrinsic then or what? I very 
> much like the OpenCL syntax for permutation, but I suppose that an 
> intrinsic such as "float[4] noref permute(float[4] noref vec, int 
> newPos0, int newPos1, int newPos2, int newPos3)" would work as well. 
> Note that this should also work with double[2], byte[16], short[8] and 
> int[4].

Note that if you had static arrays with value semantics, with proper 
alignment, then you could simply create

module std.swizzle;
// Positions are 1-based here: x = 1, y = 2, z = 3, w = 4.
float[4] permute(float[4] vec, int newPos0, int newPos1, int newPos2,
                 int newPos3);  /* intrinsic */

float[4] wzyx(float[4] q) { return permute(q, 4, 3, 2, 1); }
float[4] xywz(float[4] q) { return permute(q, 1, 2, 4, 3); }
// etc, one wrapper per ordering

---
and your code would be:

import std.swizzle;

void main()
{
    float[4] t;
    auto u = t.wzyx;
}
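
For anyone who wants to try the idea without compiler support, here is a
minimal sketch that replaces the proposed intrinsic with an ordinary
function. It assumes a present-day D2 compiler, where static arrays are
value types, and it keeps the 1-based positions implied by the wrappers
above. The body of permute is my stand-in only; a real intrinsic would map
to a single shuffle instruction.

```d
// Hypothetical library stand-in for the proposed permute intrinsic.
// Positions are 1-based (x = 1, y = 2, z = 3, w = 4).
float[4] permute(float[4] vec, int newPos0, int newPos1, int newPos2,
                 int newPos3)
{
    // vec was passed by value, so the caller's array is never modified.
    float[4] r = [vec[newPos0 - 1], vec[newPos1 - 1],
                  vec[newPos2 - 1], vec[newPos3 - 1]];
    return r;
}

float[4] wzyx(float[4] q) { return permute(q, 4, 3, 2, 1); }
float[4] xywz(float[4] q) { return permute(q, 1, 2, 4, 3); }

void main()
{
    float[4] t = [1.0f, 2.0f, 3.0f, 4.0f];
    auto u = t.wzyx;   // free function called with member syntax

    assert(u == [4.0f, 3.0f, 2.0f, 1.0f]);
    assert(t == [1.0f, 2.0f, 3.0f, 4.0f]);   // original left untouched
}
```

The two asserts check the reordering and the value semantics respectively.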

I don't think this is terribly difficult once the value semantics are in 
place.
(Note that once you get beyond 4 members, the .xyzw syntax gives an 
explosion of functions, but I think it's workable at 4: 4! is only 24 
pure permutations. Beyond that point, you'd probably require direct 
permute calls.)
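
One possible way around writing each wrapper by hand, sketched here as an
assumption rather than anything proposed in this thread, is a single
template that takes the component letters as a compile-time string. The
names swizzle and componentIndex are hypothetical, and the code assumes a
present-day D2 compiler with static foreach.

```d
// Hypothetical helper: component letter -> 0-based index, -1 if unknown.
int componentIndex(char c)
{
    switch (c)
    {
        case 'x': return 0;
        case 'y': return 1;
        case 'z': return 2;
        case 'w': return 3;
        default:  return -1;
    }
}

// Hypothetical template: one definition covers every 4-letter selector,
// so no per-ordering wrapper functions are needed.
float[4] swizzle(string order)(float[4] v)
{
    static assert(order.length == 4, "need exactly four component letters");
    float[4] r;
    static foreach (i; 0 .. 4)
    {
        // Bad letters are rejected at compile time.
        static assert(componentIndex(order[i]) >= 0, "unknown component");
        r[i] = v[componentIndex(order[i])];
    }
    return r;
}

void main()
{
    float[4] t = [1.0f, 2.0f, 3.0f, 4.0f];
    assert(t.swizzle!"wzyx" == [4.0f, 3.0f, 2.0f, 1.0f]);
    assert(t.swizzle!"xxyy" == [1.0f, 1.0f, 2.0f, 2.0f]);
}
```

The trade-off is syntax: t.swizzle!"wzyx" is a little noisier than t.wzyx,
but the selector set no longer has to be enumerated.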


