Also, to avoid cache line thrashing, sort the swaps within each parallel group by leftmost index, so memory is touched in ascending order. E.g. this line:

18,19, 20,21, 2,4, 1,3, 0,5, 6,8, 7,9, 10,12, 11,13,

becomes:

0,5, 1,3, 2,4, 6,8, 7,9, 10,12, 11,13, 18,19, 20,21,

Andrei
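
P.S. A minimal sketch of that reordering, in Python purely for illustration. The function name `sort_parallel_swaps` is hypothetical and not taken from any existing code. It assumes every pair belongs to one parallel group, i.e. the pairs touch disjoint indices, which is what makes reordering them safe:

```python
def sort_parallel_swaps(pairs):
    """Order the swaps of one parallel group by their leftmost index.

    Hypothetical helper: `pairs` is a list of (i, j) index tuples that
    are assumed to be independent (no index appears twice).
    """
    seen = set()
    for i, j in pairs:
        # Reordering is only valid if the swaps really are independent.
        if i == j or i in seen or j in seen:
            raise ValueError(f"swap ({i}, {j}) is not independent of the others")
        seen.update((i, j))
    # Sort by the smaller index of each pair so accesses walk memory upward.
    return sorted(pairs, key=min)


layer = [(18, 19), (20, 21), (2, 4), (1, 3), (0, 5),
         (6, 8), (7, 9), (10, 12), (11, 13)]
print(sort_parallel_swaps(layer))
# [(0, 5), (1, 3), (2, 4), (6, 8), (7, 9), (10, 12), (11, 13), (18, 19), (20, 21)]
```

The independence check is there because swaps that share an index must keep their relative order; only swaps from the same parallel group can be shuffled like this.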