> As far as I know, the cost for cache coherency comes when core-to-core transfer is required. Can we expect thread-local storage to be faster than shared memory?
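
One way to probe this is to compare a counter that all threads update in place with one that each thread keeps privately and publishes once. Below is a minimal C++ sketch of that comparison. The function names `count_shared` and `count_thread_local` are made up for illustration, and the thread and iteration counts are arbitrary. It only shows the two access patterns; it claims no timings, and an optimizing compiler may collapse the thread-local loop into a single addition, so any measurement would need care.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Shared-memory pattern: every increment is an atomic read-modify-write on
// the same variable, so its cache line has to move between the cores that
// run the worker threads.
std::uint64_t count_shared(unsigned num_threads, std::uint64_t iters) {
    std::atomic<std::uint64_t> total{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&total, iters] {
            for (std::uint64_t i = 0; i < iters; ++i) {
                total.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}

// Thread-local pattern: each thread gets its own instance of this variable,
// so the hot loop touches no memory that another thread writes.
thread_local std::uint64_t tls_count = 0;

std::uint64_t count_thread_local(unsigned num_threads, std::uint64_t iters) {
    std::atomic<std::uint64_t> total{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&total, iters] {
            tls_count = 0;
            for (std::uint64_t i = 0; i < iters; ++i) {
                ++tls_count;
            }
            // A single shared write per thread publishes the private result.
            total.fetch_add(tls_count, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}
```

Both functions return `num_threads * iters`; they differ only in how often a cache line written by one thread is touched by another. Timing each call (for example with `std::chrono::steady_clock`) on the target machine is what would actually answer the question for a given workload.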