Also, if you read a shared value with atomicLoad every time, the compiler can't keep it cached in a register or on the stack, which is another performance hit. If possible, read the shared value once and cache it in a local variable.
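To illustrate, here is a rough sketch of the two patterns in C++, using `std::atomic<int>::load` as a stand-in for atomicLoad. The names `shared_scale`, `sum_scaled_reload` and `sum_scaled_cached` are made up for the example, and acquire ordering is just an assumption about what the surrounding code would need.

```cpp
#include <atomic>
#include <vector>

// Hypothetical shared value that some other thread may update.
std::atomic<int> shared_scale{3};

// Atomic load on every iteration: compilers generally do not hoist or
// merge atomic loads, so the value is re-read from memory each time.
long long sum_scaled_reload(const std::vector<int>& xs) {
    long long total = 0;
    for (int x : xs) {
        total += static_cast<long long>(x) *
                 shared_scale.load(std::memory_order_acquire);
    }
    return total;
}

// One atomic load, then an ordinary local that the optimizer is free
// to keep in a register for the whole loop.
long long sum_scaled_cached(const std::vector<int>& xs) {
    const int scale = shared_scale.load(std::memory_order_acquire);
    long long total = 0;
    for (int x : xs) {
        total += static_cast<long long>(x) * scale;
    }
    return total;
}
```

The cached version works on a snapshot, so it will not see a write that lands in the middle of the loop. That is usually what you want anyway (one consistent value per pass), but it only applies when the code does not need to react to updates immediately.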