OOP, faster data layouts, compilers
Don
nospam at nospam.com
Tue Apr 26 01:01:06 PDT 2011
Sean Cavanaugh wrote:
> On 4/22/2011 2:20 PM, bearophile wrote:
>> Kai Meyer:
>>
>>> The purpose of the original post was to indicate that some low level
>>> research shows that underlying data structures (as applied to video game
>>> development) can have an impact on the performance of the application,
>>> which D (I think) cares very much about.
>>
>> The idea of the original post was a bit more complex: how can we
>> invent new/better ways to express semantics in D code that will not
>> prevent future D compilers from making some changes to the layout of
>> data structures to increase code performance? Complex transforms of
>> the data layout seem too complex for even a good compiler, but
>> maybe simpler ones will be possible. And I think to do this the D code
>> needs some more semantics. I was suggesting an annotation that forbids
>> inbound pointers, which allows the compiler to move data around a
>> little, but this is just a start.
>>
>> Bye,
>> bearophile
>
>
> In many ways the biggest thing I use regularly in game development that
> I would lose by moving to D would be good built-in SIMD support. The PC
> compilers from MS and Intel both have intrinsic data types and
> instructions that cover all the operations from SSE1 up to AVX. The
> intrinsics are nice in that the job of register allocation and
> scheduling is given to the compiler and generally the code it outputs is
> good enough (though it needs to be watched at times).
>
> Unlike ASM, intrinsics can be inlined, so your math library can provide a
> platform abstraction at that layer before building up to larger
> operations (like vectorized forms of sin, cos, etc.) and algorithms (like
> frustum cull checks, k-DOP polygon collision, etc.), which makes porting
> and reusing the algorithms on other platforms much, much easier, as only
> the low-level layer needs to be ported, and only outliers at the
> algorithm level need to be tweaked after you get it up and running.
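
For illustration, a minimal D sketch of that layering; mul4, add4 and
dot4 are hypothetical names, and the bodies are portable fallbacks
written with array operations rather than real intrinsics:

// Low-level layer: the only part that would need a per-platform port.
float[4] mul4(float[4] a, float[4] b)
{
    float[4] r;
    r[] = a[] * b[];
    return r;      // static arrays are returned by value
}

float[4] add4(float[4] a, float[4] b)
{
    float[4] r;
    r[] = a[] + b[];
    return r;
}

// Higher-level routine written only against the layer above,
// so it ports unchanged.
float dot4(float[4] a, float[4] b)
{
    float[4] p = mul4(a, b);
    return p[0] + p[1] + p[2] + p[3];
}
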
>
> On the consoles there is AltiVec (VMX), which is very similar to SSE in
> many ways. The common ground is basically SSE1-tier operations: 128-bit
> values with 4x32-bit integer and 4x32-bit float support. 64-bit
> AMD/Intel makes SSE2 the minimum standard, and a systems language on
> those platforms should reflect that.
Yes. It is primarily for this reason that we made static arrays
return-by-value. The intention is that on x86, a float[4] will be held
in an SSE register.
So it should be possible to write SIMD code with standard array
operations. (Note that this is *much* easier for the compiler than
trying to vectorize scalar code.)
This gives syntax like:
float[4] a, b, c;
a[] += b[] * c[];
(currently works, but doesn't use SSE, so has dismal performance).
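
For reference, a complete, compilable sketch of that syntax (the values
are arbitrary, chosen only for illustration):

import std.stdio;

void main()
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [10, 20, 30, 40];
    float[4] c = [2, 2, 2, 2];

    // Element-wise multiply-accumulate over the whole static array.
    a[] += b[] * c[];

    writeln(a);   // [21, 42, 63, 84]
}
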
>
> Loading and storing is comparable across platforms with similar
> alignment restrictions or penalties for working with unaligned data.
> Packing/swizzle/shuffle/permuting are different but this is not a huge
> problem for most algorithms. The lack of fused multiply and add on the
> Intel side can be worked around or abstracted (i.e. always write code as
> if it existed, have the Intel version expand to multiple ops).
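
For illustration, a minimal D sketch of that workaround; madd is a
hypothetical name, and this portable fallback simply expands to a
multiply followed by an add, while a VMX port could map the same call
to a single fused instruction:

float[4] madd(float[4] a, float[4] b, float[4] c)
{
    float[4] r;
    r[] = a[] * b[] + c[];   // a*b + c, element-wise
    return r;
}
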
>
> And now my wish list:
>
> If you have worked with shader programming through HLSL or Cg, the
> expressiveness of doing the work in SIMD is very high. If I could write
> something that looked exactly like HLSL but was integrated perfectly
> into a language like D or C++, it would be pretty huge to me. The amount
> of math you can fit in a line or two of HLSL is mind-boggling at times,
> yet extremely intuitive and rather easy to debug.
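
As a rough sketch of what that expressiveness could look like in D,
here is a hypothetical float4 wrapper built on operator overloading
(not a real library, just an illustration):

struct float4
{
    float[4] v;

    // Element-wise +, - and * via D's array operations.
    float4 opBinary(string op)(float4 rhs)
        if (op == "+" || op == "-" || op == "*")
    {
        float4 r;
        mixin("r.v[] = v[] " ~ op ~ " rhs.v[];");
        return r;
    }
}

// One-liners then read much like HLSL:
float4 lerp(float4 a, float4 b, float4 t)
{
    return a + (b - a) * t;   // the HLSL built-in lerp(a, b, t)
}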