Performance of tables slower than built in?

Fri May 24 11:45:46 UTC 2019

On Friday, 24 May 2019 at 08:33:34 UTC, Ola Fosheim Grøstad wrote:
> On Thursday, 23 May 2019 at 21:47:45 UTC, Alex wrote:
>> Either way, sin is still twice as fast. Also, in the code 
>> the sinTab version is missing the writeln, so it would have 
>> been faster, so it is not being optimized out.
>
> Well, when I run this modified version:
>
> https://gist.github.com/run-dlang/9f29a83b7b6754da98993063029ef93c
>
> on https://run.dlang.io/
>
> then I get:
>
> LUT:    709
> sin(x): 2761
>
> So the LUT is 3-4 times faster even with your quarter-LUT 
> overhead.

FWIW, as far as I can tell I managed to get the lookup version 
down to 104 by using bit manipulation tricks like these:

auto fastQuarterLookup(double x){
     const ulong mantissa = cast(ulong)( (x - floor(x)) * (cast(double)(1UL<<63)*2.0) );
     const double sign = cast(double)(-cast(uint)((mantissa>>63)&1));
     … etc

So it seems like a quarter-wave LUT is about 27 times faster than sin (2761 vs. 104)…

You just have to make sure that the generated instructions keep 
the entire CPU pipeline full.
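
For anyone who wants to try the idea, here is a minimal sketch of a quarter-wave lookup along those lines. It is not the code from the gist: makeQuarterTable and quarterLookup are names made up for illustration, the argument is assumed to be in turns (1.0 = one full period, which is what x - floor(x) suggests), and it uses a 32-bit fixed-point phase with no interpolation, so accuracy is limited by the table size.

```d
import std.math : floor, sin, PI;

/// Builds the first quarter of a sine wave, sampled at cell midpoints.
/// Midpoint sampling makes the mirrored quadrants a plain index flip.
/// Assumes 1 <= bits <= 24 (hypothetical limit chosen for this sketch).
double[] makeQuarterTable(uint bits)
{
    immutable size_t n = size_t(1) << bits;
    auto tab = new double[](n);
    foreach (i; 0 .. n)
        tab[i] = sin(2.0 * PI * (i + 0.5) / (4.0 * n));
    return tab;
}

/// Approximates sin(2*PI*turns) using only the quarter-wave table.
double quarterLookup(const double[] tab, uint bits, double turns)
{
    // Fractional part of the phase as 32-bit fixed point.
    // Going through ulong lets a value that rounds up to 2^32 wrap to 0.
    immutable double frac = turns - floor(turns);
    immutable uint phase = cast(uint) cast(ulong)(frac * 4294967296.0);

    immutable uint quadrant = phase >> 30;      // top two bits: 0..3
    uint idx = (phase << 2) >> (32 - bits);     // next `bits` bits

    // Odd quadrants run backwards through the table: flip the index bits.
    immutable uint mirror = 0 - (quadrant & 1); // 0 or 0xFFFF_FFFF
    idx = (idx ^ mirror) & ((1u << bits) - 1);

    // Quadrants 2 and 3 are the negative half of the wave.
    immutable double sign = 1.0 - 2.0 * (quadrant >> 1);
    return sign * tab[idx];
}
```

With bits = 10 the table has 1024 entries and the worst-case error is about PI/(4*1024), roughly 8e-4. There are no branches on the lookup path, which is the point of the bit tricks: the quadrant handling becomes a couple of shifts, an xor and a multiply.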