primitive vector types
Don
nospam at nospam.com
Mon Feb 23 00:18:33 PST 2009
Mattias Holm wrote:
> On 2009-02-21 17:03:06 +0100, Don <nospam at nospam.com> said:
>>
>> I don't think that's messy at all. I can't see much difference between
>> special support for float[4] versus float4. It's better if the code
>> can take advantage of hardware without specific support. Bear in mind
>> that SSE/SSE2 is a temporary situation. AVX provides for much longer
>> arrays of vectors; and it's extensible. You'd end up needing to keep
>> adding on special types whenever a new CPU comes out.
>>
>> Note that the fundamental concept which is missing from the C virtual
>> machine is that all modern machines can efficiently perform operations
>> on arrays of built-in types of length 2^n, for some small value of n.
>> We need to get this into the language abstraction. Not follow C++ in
>> hacking a few extra special types onto the old, deficient C model. And
>> I think D is actually in a position to do this.
>>
>> float[4] would be a greatly superior option if it could be done.
>> The key requirements are:
>> (1) need to specify that static arrays are passed by value.
>> (2) need to keep stack aligned to 16.
>> The good news is that both of these appear to be done on DMD2-Mac!
>
> Yes, float[4] would be ok, if some CPU independent permutation support
> can be added. Would this be with some intrinsic then or what? I very
> much like the OpenCL syntax for permutation, but I suppose that an
> intrinsic such as "float[4] noref permute(float[4] noref vec, int
> newPos0, int newPos1, int newPos2, int newPos3)" would work as well.
> Note that this should also work with double[2], byte[16], short[8] and
> int[4].
Note that if you had static arrays with value semantics, with proper
alignment, then you could simply create:
module std.swizzle;

float[4] permute(float[4] vec, int newPos0, int newPos1, int newPos2,
                 int newPos3); /* intrinsic */
float[4] wzyx(float[4] q) { return permute(q, 4, 3, 2, 1); }
float[4] xywz(float[4] q) { return permute(q, 1, 2, 4, 3); }
// etc
---
and your code would be:
import std.swizzle;

void main()
{
    float[4] t;
    auto u = t.wzyx;
}
I don't think this is terribly difficult once the value semantics are in
place.
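
To make that concrete, here is a minimal runnable sketch. It assumes a compiler where static arrays are passed and returned by value, and where a free function can be called with member syntax on a float[4]. The body of permute below is only a hypothetical software stand-in for the intrinsic (a real one would presumably map to a single shuffle instruction). It uses 1-based positions to match the calls above.

```d
// Hypothetical software stand-in for the proposed permute intrinsic.
// Positions are 1-based, matching the wzyx/xywz definitions in the post.
float[4] permute(float[4] vec, int newPos0, int newPos1, int newPos2,
                 int newPos3)
{
    float[4] result = [vec[newPos0 - 1], vec[newPos1 - 1],
                       vec[newPos2 - 1], vec[newPos3 - 1]];
    return result;
}

float[4] wzyx(float[4] q) { return permute(q, 4, 3, 2, 1); }
float[4] xywz(float[4] q) { return permute(q, 1, 2, 4, 3); }

void main()
{
    float[4] t = [1.0f, 2.0f, 3.0f, 4.0f];

    auto u = t.wzyx;            // member-style call on a static array
    assert(u == [4.0f, 3.0f, 2.0f, 1.0f]);

    auto v = t.xywz;
    assert(v == [1.0f, 2.0f, 4.0f, 3.0f]);

    // Value semantics: u is a copy, so writing to it leaves t untouched.
    u[0] = 0.0f;
    assert(t == [1.0f, 2.0f, 3.0f, 4.0f]);
}
```

The last assertion is the point of requirement (1): without by-value static arrays, the swizzle result could not be treated as an independent vector.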
(Note that once you get beyond 4 members, the .xyzw syntax gives an
explosion of functions, but I think it's workable at 4: 4! is only 24.
Beyond that point, you'd probably require direct permute calls.)
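
The 24 functions need not be written by hand. As a sketch only, and not something proposed above, they could be generated at compile time with a string mixin. The helper names makeSwizzles and countLines are made up for this illustration, and permute is again a hypothetical software stand-in for the intrinsic.

```d
// Hypothetical software stand-in for the permute intrinsic (1-based positions).
float[4] permute(float[4] vec, int newPos0, int newPos1, int newPos2,
                 int newPos3)
{
    float[4] result = [vec[newPos0 - 1], vec[newPos1 - 1],
                       vec[newPos2 - 1], vec[newPos3 - 1]];
    return result;
}

// Evaluated at compile time: returns D source text declaring one function
// per ordering of x, y, z, w in which each component appears exactly once.
string makeSwizzles()
{
    string letters = "xyzw";
    string digits = "1234";
    string code;
    foreach (i; 0 .. 4)
    foreach (j; 0 .. 4)
    foreach (k; 0 .. 4)
    foreach (l; 0 .. 4)
    {
        if (i == j || i == k || i == l || j == k || j == l || k == l)
            continue; // skip names that repeat a component
        string name = "" ~ letters[i] ~ letters[j] ~ letters[k] ~ letters[l];
        code ~= "float[4] " ~ name ~ "(float[4] q) { return permute(q, "
              ~ digits[i] ~ ", " ~ digits[j] ~ ", "
              ~ digits[k] ~ ", " ~ digits[l] ~ "); }\n";
    }
    return code;
}

// Counts generated declarations (one per line of generated source).
size_t countLines(string s)
{
    size_t n;
    foreach (char ch; s)
        if (ch == '\n')
            ++n;
    return n;
}

static assert(countLines(makeSwizzles()) == 24); // 4! orderings

mixin(makeSwizzles()); // declares xyzw, xywz, ..., wzyx at module scope

void main()
{
    float[4] t = [1.0f, 2.0f, 3.0f, 4.0f];

    auto a = t.wzyx;
    assert(a == [4.0f, 3.0f, 2.0f, 1.0f]);

    auto b = t.yxwz;
    assert(b == [2.0f, 1.0f, 4.0f, 3.0f]);
}
```

The same generator would produce 8! = 40320 functions for short[8], which is why direct permute calls look like the better fit beyond 4 members.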