LLVM and TLS

Sat Feb 21 20:33:58 PST 2015

"Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang at gmail.com> writes:

> On Wednesday, 18 February 2015 at 20:05:58 UTC, Jonathan Marler wrote:
>>
>> If I turn on optimization they both take 7 milliseconds.
>
> You cannot benchmark it like this. To make it more realistic you
> should use multiple compilation units, add fences and cache
> invalidation.

Hmm, you got me thinking.  An mfence should not be needed for TLS, so in
a multithreaded program an expensive TLS lookup could still win over
shared access.  And if the cache is blown, wouldn't the time to reload it
begin to dominate either way?  I know all of this is very architecture
dependent, but I have been wary of the number of instructions a TLS
lookup takes compared to a shared access.  Perhaps I should not be.  Am I
thinking correctly?
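
To make the question concrete, here is a rough sketch of the two access
patterns.  It is C++ rather than D, it is not the benchmark from earlier
in the thread, and the names (tls_counter, shared_counter, bump_tls,
bump_shared, run) are made up for illustration.  The thread-local
counter is updated with plain loads and stores; the shared counter uses
a sequentially consistent atomic add, which on x86 typically becomes a
lock-prefixed instruction that acts as a full barrier.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical counters, for illustration only.
thread_local std::uint64_t tls_counter = 0;    // one copy per thread
std::atomic<std::uint64_t> shared_counter{0};  // one copy seen by all threads

// Thread-local update: pays for the TLS address lookup, but needs no
// fence because no other thread can observe this variable.
std::uint64_t bump_tls(int n) {
    for (int i = 0; i < n; ++i) ++tls_counter;
    return tls_counter;
}

// Shared update: cheap to address, but each increment is an atomic
// read-modify-write with sequentially consistent ordering.
void bump_shared(int n) {
    for (int i = 0; i < n; ++i)
        shared_counter.fetch_add(1, std::memory_order_seq_cst);
}

// Run both patterns on `threads` fresh threads and return the sum of
// the per-thread TLS totals.
std::uint64_t run(int threads, int n) {
    std::vector<std::uint64_t> tls_totals(threads, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&tls_totals, t, n] {
            tls_totals[t] = bump_tls(n);  // each thread starts from 0
            bump_shared(n);
        });
    for (auto& th : pool) th.join();
    std::uint64_t sum = 0;
    for (auto v : tls_totals) sum += v;
    return sum;
}
```

Calling run(4, 1000) does the same number of increments through both
paths.  Timing bump_tls and bump_shared separately, ideally with the
counters defined in a different compilation unit so the optimizer cannot
fold the loops away, would be one way to see which cost dominates on a
given machine.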
--
Dan