Optimizing a raytracer

"Róbert László Páli" robertlaszlopali at gmail.com
Thu Oct 17 07:25:15 PDT 2013


@Jacob Carlborg
> I would say use structs. For compiler I would go with LDC or 
> GDC. Both of these are faster for floating point calculations 
> than DMD. You can always benchmark.

Thank you for the advice!
I installed ldc and used ldmd2.
The benchmarks are amazing! :O

DMD > compile = 2503 > run = 26210
LDMD > compile = 3953 > run = 8935

These times are in milliseconds,
measured with the time command.
Both were compiled with the same flags:
-O -inline -release -noboundscheck

@finalpatch
> I find it critical to ensure all loops are unrolled in basic 
> vector ops (copy/arithmetic/dot etc.)

In these crucial parts I don't use loops;
I wrote the operations out by hand. The
vectors are simply three named doubles.
But thanks for the advice.
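
To illustrate what I mean, here is a minimal sketch
(not the actual raytracer code; the names Vec3, add,
dot and cross are only illustrative):

```d
// Hypothetical 3D vector: three named doubles, no array, no loop.
struct Vec3
{
    double x, y, z;
}

// Component-wise addition, written out by hand.
Vec3 add(Vec3 a, Vec3 b)
{
    return Vec3(a.x + b.x, a.y + b.y, a.z + b.z);
}

// Dot product: three multiplications and two additions.
double dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Cross product, again fully written out.
Vec3 cross(Vec3 a, Vec3 b)
{
    return Vec3(a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x);
}
```

Since every component is spelled out, there is
nothing left for the compiler to unroll.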

@ponce
> If you are on x86, SSE 4.1 introduced an instruction called 
> DPPS which performs a dot product. Maybe you can force it into 
> doing a cross-product with clever swizzles and masks.

Could you give me a hint on how this could
be implemented in D to use that dot product?
I am not experienced with such low-level programming.

And would you suggest trying
SIMD double4 for 3D vectors? It would
take some time to change the code.

