OOP, faster data layouts, compilers

bearophile bearophileHUGS at lycos.com
Fri Apr 22 20:05:54 PDT 2011


Sean Cavanaugh:

> In C++ the intrinsics are easily wrapped by __forceinline global
> functions, to provide a platform abstraction against the intrinsics.

When AVX becomes 512 bits wide, or when you need to use a very different set of vector registers, your global functions need to change, and so does the code that calls them. This is acceptable for library code, but it's not good for D built-in operations. D built-in vector ops need to be cleaner, more general and longer-lasting, even if they may not fully replace SSE intrinsics.
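
As a rough sketch of what I mean by "general": the array operations D already has express element-wise arithmetic without naming any register width. The helper name addArrays below is hypothetical, and whether a given compiler actually emits SSE or AVX code for it is an assumption, not something this snippet guarantees.

```d
// Hypothetical helper: element-wise sum written with D array operations.
// Nothing here mentions 128, 256 or 512 bits, so the source does not need
// to change when the width of the hardware vector registers changes.
void addArrays(const(float)[] a, const(float)[] b, float[] result)
{
    // Array operations require operands of equal length.
    assert(a.length == b.length && b.length == result.length);
    result[] = a[] + b[];
}

unittest
{
    float[8] x = [1, 2, 3, 4, 5, 6, 7, 8];
    float[8] y = 10;   // every element initialized to 10
    float[8] z;
    addArrays(x[], y[], z[]);
    assert(z == [11f, 12, 13, 14, 15, 16, 17, 18]);
}
```

The same call works for 4, 8 or 16 floats; only the compiler back-end has to know how wide the registers are.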


> I would say in D this could be faked provided the language at a minimum
> understood what a 128 (SSE1 through 4.2) and 256 bit value (AVX) was and
> how to efficiently move it via registers for function calls.

Also think about what the D ABI will be 15-25 years from now. The D design must look a bit further ahead too.


> Now the original topic pertains to data layouts,

It was about how to not preclude future D compilers from shuffling data around a bit by themselves :-)


> I would argue the above
> code is an idealistic example, as when writing SIMD code you almost
> always have to transpose or rotate one of the sets of data to work in
> parallel across the other one.

Right.


> float4 a = {1,2,3,4};
> float4 b = {5,6,7,8};
> float4 c = {-1,0,1,2};
> float4 d = {0,0,0,0};
> float4 foo = (c > d) ? a : b;

Recently I have asked for a D vector comparison operation too (the compiler is supposed to be able to split the arrays into register-sized chunks for the comparisons); this is good for AVX instructions. A small problem here is that I think DMD currently allocates memory on the heap to instantiate those four little arrays:

int[4] a = [1,2,3,4];
int[4] b = [5,6,7,8];
int[4] c = [-1,0,1,2];
int[4] d = [0,0,0,0];
int[4] foo = (c[] > d[]) ? a[] : b[];
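
Since the last line above is a requested feature and not something that compiles today, here is a minimal sketch of the same per-element select written as a plain loop. The name selectGreater is hypothetical; a compiler that supported the vector comparison could in principle lower the built-in form to something equivalent.

```d
// Hypothetical helper: per-element select over fixed-size arrays,
// result[i] = (c[i] > d[i]) ? a[i] : b[i].
int[N] selectGreater(size_t N)(int[N] c, int[N] d, int[N] a, int[N] b)
{
    int[N] result;
    foreach (i; 0 .. N)
        result[i] = c[i] > d[i] ? a[i] : b[i];
    return result;
}

unittest
{
    int[4] a = [1, 2, 3, 4];
    int[4] b = [5, 6, 7, 8];
    int[4] c = [-1, 0, 1, 2];
    int[4] d = [0, 0, 0, 0];
    int[4] foo = selectGreater(c, d, a, b);
    // c > d holds only for the last two elements, so those come from a.
    assert(foo == [5, 6, 3, 4]);
}
```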


> Things get real messy when you have multiple vertex attributes as
> decisions to keep them together or separate are conflicting and both
> choices make sense to different systems :)

It's not easy for future compilers to perform similar auto-vectorizations :-)

Bye and thank you for your answer,
bearophile