Idempotent partition around median of 5?

Xinok via Digitalmars-d digitalmars-d at puremagic.com
Fri Feb 5 13:35:54 PST 2016


On Friday, 5 February 2016 at 17:33:55 UTC, Era Scarecrow wrote:
> On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
>> One swap usually decomposes into three moves.
>
>  Recently from reading some interesting hacks and super code, a 
> good swap can also be done via 3 xor operations (and avoiding 
> an intermediate storage unit).
>
> http://aggregate.ee.engr.uky.edu/MAGIC/#Swap%20Values%20Without%20a%20Temporary
>
> x ^= y;
> y ^= x;
> x ^= y;
>
>  Since these can be used directly as registers it might be 
> faster (assuming all 5 are mapped to registers until the final 
> write out, which might not work).

It's a cool trick but I would be surprised if this were actually 
faster. Modern x64 processors have 16 "general purpose" registers 
so saving a single register for a simple set of instructions is 
likely to not have any benefit. My other thought is that the 
compiler may not recognize this pattern, making it more difficult 
to optimize. On the other hand, compilers are very good at 
rearranging simple reads and writes to avoid stalling the 
pipeline in the CPU.


More information about the Digitalmars-d mailing list