Performance of tables slower than built in?
    Ola Fosheim Grøstad 
    ola.fosheim.grostad at gmail.com
       
    Fri May 24 12:20:12 UTC 2019
    
    
  
On Friday, 24 May 2019 at 12:01:55 UTC, Alex wrote:
> Well, the QuarterWave was supposed to generate just a quarter 
> since that is all that is required for these functions due to 
> symmetry and periodicity. I started with a half to get that 
> working and then figure out the sign flipping.
Sure, it is a tradeoff. You pollute the cache less this way, but 
you have to figure out the sign and the lookup direction.
The trick is to turn the phase into an unsigned integer. Then:
1. The highest bit tells you whether to invert the sign of the 
result.
2. The next-highest bit tells you whether to look up in the 
reverse direction.
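To make the two points above concrete, here is a minimal sketch in D. It assumes the full ulong range is mapped onto one period of the waveform, so each quadrant covers 2^62 values. The helper names negateFlag and reverseFlag are made up for illustration.

```d
import std.stdio : writefln;

/// Hypothetical helper: true when the result must be negated
/// (second half of the period, assuming ulong covers one period).
bool negateFlag(ulong phase)
{
    return ((phase >> 63) & 1) != 0;
}

/// Hypothetical helper: true when the quarter table must be read
/// back to front (second and fourth quadrants of a sine).
bool reverseFlag(ulong phase)
{
    return ((phase >> 62) & 1) != 0;
}

void main()
{
    // One sample phase in the middle of each quadrant.
    immutable ulong[4] phases = [
        0x2000_0000_0000_0000UL, // 1/8 of a period
        0x6000_0000_0000_0000UL, // 3/8 of a period
        0xA000_0000_0000_0000UL, // 5/8 of a period
        0xE000_0000_0000_0000UL, // 7/8 of a period
    ];
    foreach (p; phases)
        writefln("%016X negate=%s reverse=%s", p, negateFlag(p), reverseFlag(p));
}
```

Both flags come straight out of a shift and a mask, so no comparison against quadrant boundaries is needed.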
The key to performance here is that x86 can execute many simple 
integer/bit operations in parallel, but only a few floating-point 
operations.
Also avoid all conditionals. Use bitmasking instead, something 
along the lines of:
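// Bit 62 set: flip bits 0..62 (reverse direction); clear: only bit 63 flips.
const ulong phase = mantissa^((1UL<<63)-((mantissa>>62)&1));
// Bits 53..61 give a 9-bit index into a 512-entry quarter table.
const uint quarterphase = (phase>>53)&511;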
(Haven't checked the correctness of this, but this shows the 
general principle.)
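For anyone who wants to try it, here is one way the principle could be fleshed out into a branch-free quarter-wave sine lookup. This is only a sketch under a few assumptions that are not in the snippet above: the full ulong range covers one period, the table has 512 entries sampled at bin midpoints (so that the bit-inverted index mirrors exactly), and the sign is applied by XORing the float's sign bit. The names makeQuarterTable, tableSin and FloatBits are made up.

```d
import std.math : sin, fabs, PI;
import std.stdio : writefln;

enum tableBits = 9;
enum tableSize = 1 << tableBits; // 512 entries for a quarter period

/// Builds the quarter table, sampled at the midpoint of each bin.
float[tableSize] makeQuarterTable()
{
    float[tableSize] t;
    foreach (i; 0 .. tableSize)
        t[i] = cast(float) sin((i + 0.5) * (PI / 2) / tableSize);
    return t;
}

/// Overlay used to reach the sign bit of a float.
union FloatBits
{
    float f;
    uint u;
}

/// Sine lookup without conditionals; phase 0 .. ulong.max is one period.
float tableSin(ref const float[tableSize] table, ulong phase)
{
    // All ones when bit 62 is set (read the table backwards), else zero.
    const ulong reverseMask = 0UL - ((phase >> 62) & 1);
    // Bits 53..61 select the entry; XOR with the mask mirrors the index.
    const uint index = cast(uint)((phase ^ reverseMask) >> (62 - tableBits))
        & (tableSize - 1);
    FloatBits r;
    r.f = table[index];
    // Bit 63 of the phase becomes the sign bit of the result.
    r.u ^= cast(uint)(phase >> 63) << 31;
    return r.f;
}

void main()
{
    const table = makeQuarterTable();
    foreach (phase; [0x2000_0000_0000_0000UL, 0x6000_0000_0000_0000UL,
                     0xA000_0000_0000_0000UL, 0xE000_0000_0000_0000UL])
        writefln("%016X -> %+.4f", phase, tableSin(table, phase));
}
```

With no interpolation the error is bounded by roughly half a table step, so this version trades accuracy for speed. Whether it actually beats the built-in sin would have to be measured.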
Ola.
    
    
More information about the Digitalmars-d-learn
mailing list