OOP, faster data layouts, compilers

Fri Apr 22 14:20:53 PDT 2011

On 4/22/2011 2:20 PM, bearophile wrote:
> Kai Meyer:
>
>> The purpose of the original post was to indicate that some low level
>> research shows that underlying data structures (as applied to video game
>> development) can have an impact on the performance of the application,
>> which D (I think) cares very much about.
>
> The idea of the original post was a bit more complex: how can we invent new or better ways to express semantics in D code that will not prevent future D compilers from making small changes to the layout of data structures to improve code performance? Complex transforms of the data layout seem too complex for even a good compiler, but maybe simpler ones will be possible. And I think that to do this, D code needs some more semantics. I was suggesting an annotation that forbids inbound pointers, which allows the compiler to move data around a little, but this is just a start.
>
> Bye,
> bearophile

In many ways the biggest thing I use regularly in game development that 
I would lose by moving to D would be good built-in SIMD support.  The PC 
compilers from MS and Intel both have intrinsic data types and 
instructions that cover all the operations from SSE1 up to AVX.  The 
intrinsics are nice in that the job of register allocation and 
scheduling is given to the compiler and generally the code it outputs is 
good enough (though it needs to be watched at times).

Unlike ASM, intrinsics can be inlined, so your math library can provide a 
platform abstraction at that layer before building up to larger 
operations (like vectorized forms of sin, cos, etc.) and algorithms (like 
frustum cull checks, k-DOP polygon collision, etc.).  This makes porting 
the algorithms to other platforms and reusing them there much easier: 
only the low-level layer needs to be ported, and only outliers at the 
algorithm level need to be tweaked once it is up and running.
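
As a rough illustration of that layering, here is a minimal C++ sketch, 
not taken from any particular engine.  The names simd::float4, load, 
store, add, mul and dot3 are hypothetical; only the _mm_* calls are real 
SSE1 intrinsics from <xmmintrin.h>.  The scalar branch stands in for 
whatever another platform (e.g. AltiVec) would provide.

```cpp
// Low-level layer: the only part that changes per platform.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
  #include <xmmintrin.h>          // SSE1 intrinsics
  #define SIMD_USE_SSE 1
#else
  #define SIMD_USE_SSE 0
#endif

namespace simd {

#if SIMD_USE_SSE
struct float4 { __m128 v; };      // hypothetical wrapper type
inline float4 load(const float* p)      { return { _mm_loadu_ps(p) }; }
inline void   store(float* p, float4 a) { _mm_storeu_ps(p, a.v); }
inline float4 add(float4 a, float4 b)   { return { _mm_add_ps(a.v, b.v) }; }
inline float4 mul(float4 a, float4 b)   { return { _mm_mul_ps(a.v, b.v) }; }
#else
// Scalar stand-in for a platform without SSE (assumption: a real port
// would use that platform's own vector intrinsics here).
struct float4 { float v[4]; };
inline float4 load(const float* p) { return {{ p[0], p[1], p[2], p[3] }}; }
inline void store(float* p, float4 a) {
    for (int i = 0; i < 4; ++i) p[i] = a.v[i];
}
inline float4 add(float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}
inline float4 mul(float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
}
#endif

// Algorithm layer: written once against the layer above.
// Computes four 3D dot products at a time from structure-of-arrays input.
inline float4 dot3(float4 ax, float4 ay, float4 az,
                   float4 bx, float4 by, float4 bz) {
    return add(add(mul(ax, bx), mul(ay, by)), mul(az, bz));
}

} // namespace simd
```

Because everything is inline, the compiler is free to keep the values in 
registers across calls; dot3 itself never mentions SSE.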

On the consoles there is AltiVec (VMX), which is very similar to SSE in 
many ways.  The common ground is basically SSE1-tier operations: 128-bit 
values treated as 4x32-bit integers or 4x32-bit floats.  64-bit 
AMD/Intel makes SSE2 the minimum standard, and a systems language on 
those platforms should reflect that.

Loading and storing are comparable across platforms, with similar 
alignment restrictions or penalties for working with unaligned data. 
Packing/swizzle/shuffle/permute operations differ, but this is not a huge 
problem for most algorithms.  The lack of a fused multiply-add on the 
Intel side can be worked around or abstracted (i.e. always write code as 
if it existed, and have the Intel version expand to multiple ops).
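
A minimal C++ sketch of that "write it as if it existed" approach 
follows.  The names mathlib::f32x4, load4, store4, splat, madd and 
poly2 are hypothetical.  On plain SSE, madd expands to a multiply 
followed by an add; the __FMA__ branch is an assumption about newer 
hardware and compilers, and the fused form can round differently from 
the two-op form.

```cpp
#if defined(__FMA__)
  #include <immintrin.h>          // provides _mm_fmadd_ps when FMA is enabled
  #define MATHLIB_HAVE_SSE 1
  #define MATHLIB_HAVE_FMA 1
#elif defined(__SSE__) || defined(_M_X64)
  #include <xmmintrin.h>          // SSE1 only: no fused multiply-add
  #define MATHLIB_HAVE_SSE 1
  #define MATHLIB_HAVE_FMA 0
#else
  #define MATHLIB_HAVE_SSE 0
  #define MATHLIB_HAVE_FMA 0
#endif

namespace mathlib {

#if MATHLIB_HAVE_SSE
using f32x4 = __m128;
inline f32x4 load4(const float* p)      { return _mm_loadu_ps(p); }
inline void  store4(float* p, f32x4 a)  { _mm_storeu_ps(p, a); }
inline f32x4 splat(float s)             { return _mm_set1_ps(s); }

// a * b + c per lane. Callers always use this one name.
inline f32x4 madd(f32x4 a, f32x4 b, f32x4 c) {
#if MATHLIB_HAVE_FMA
    return _mm_fmadd_ps(a, b, c);             // single fused op
#else
    return _mm_add_ps(_mm_mul_ps(a, b), c);   // expands to two ops
#endif
}
#else
// Scalar stand-in so the sketch builds on any platform.
struct f32x4 { float lane[4]; };
inline f32x4 load4(const float* p) { return {{ p[0], p[1], p[2], p[3] }}; }
inline void store4(float* p, f32x4 a) {
    for (int i = 0; i < 4; ++i) p[i] = a.lane[i];
}
inline f32x4 splat(float s) { return {{ s, s, s, s }}; }
inline f32x4 madd(f32x4 a, f32x4 b, f32x4 c) {
    f32x4 r;
    for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return r;
}
#endif

// Algorithm code written purely in terms of madd (Horner's rule):
// evaluates c2*x^2 + c1*x + c0 for four x values at once.
inline f32x4 poly2(f32x4 x, float c2, float c1, float c0) {
    return madd(madd(splat(c2), x, splat(c1)), x, splat(c0));
}

} // namespace mathlib
```

poly2 reads the same on every target; only madd knows whether the 
hardware can fuse the two operations.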

And now my wish list:

If you have worked with shader programming through HLSL or Cg, you know 
how expressive it is to do the work in SIMD.  If I could write 
something that looked exactly like HLSL but was integrated perfectly 
into a language like D or C++, it would be pretty huge to me.  The amount 
of math you can have in a line or two in HLSL is mind-boggling at times, 
yet extremely intuitive and rather easy to debug.
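
To give a flavor of what I mean, here is a small C++ sketch that 
borrows HLSL names (float4, dot, saturate, lerp) through operator 
overloading.  Everything in the hl namespace is hypothetical, and the 
storage is plain scalar floats to keep it short; a real version would 
wrap the SIMD intrinsics instead.  It shows the syntax only, not the 
performance.

```cpp
#include <algorithm>   // std::min, std::max

namespace hl {

struct float4 { float x, y, z, w; };

inline float4 operator+(float4 a, float4 b) { return { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w }; }
inline float4 operator-(float4 a, float4 b) { return { a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w }; }
inline float4 operator*(float4 a, float4 b) { return { a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w }; }
inline float4 operator*(float4 a, float s)  { return { a.x * s, a.y * s, a.z * s, a.w * s }; }

// Four-component dot product.
inline float dot(float4 a, float4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

// Clamp every component to the range [0, 1].
inline float4 saturate(float4 a) {
    return { std::min(std::max(a.x, 0.0f), 1.0f),
             std::min(std::max(a.y, 0.0f), 1.0f),
             std::min(std::max(a.z, 0.0f), 1.0f),
             std::min(std::max(a.w, 0.0f), 1.0f) };
}

// Linear interpolation between a and b.
inline float4 lerp(float4 a, float4 b, float t) { return a + (b - a) * t; }

// A shader-style two-liner: clamped diffuse term plus ambient.
inline float4 shade(float4 albedo, float4 normal, float4 light_dir, float4 ambient) {
    float n_dot_l = std::max(dot(normal, light_dir), 0.0f);
    return saturate(albedo * n_dot_l + ambient);
}

} // namespace hl
```

The body of shade is close to what you would type in a pixel shader, 
which is the kind of density and readability I would like to see for 
CPU-side math.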