LLVM and TLS

Mon Feb 23 06:15:57 PST 2015

On Monday, 23 February 2015 at 04:10:29 UTC, Jonathan Marler 
wrote:
> Here's what happened: I was writing a program that could 
> optionally use TLS memory.  When I turned on TLS memory it 
> slowed down considerably, but only when using an LLVM compiler.
>  No matter how I used TLS, it was much much slower when using 
> LLVM.  The simple program is just a simple way to demonstrate 
> that TLS is very slow in one specific type of program.

Yeah, demonstrating that it is slow is reasonable. I was thinking 
more about the other direction: showing that either globals or 
TLS is fast is hard without a multi-threaded best-of-breed 
baseline to compare against (i.e. that TLS is faster than 
globals, or the other way around, does not say much, since both 
can be too slow if the code gen is lacking...).

> It would be great to see another program that could demonstrate 
> that TLS is actually faster in some use cases.  However, since 
> it is sooo much slower, I think you'll have a hard time finding 
> such an example.  The simple program demonstrates that TLS is 
> almost 2 orders of magnitude slower...it may not be that much 
> slower in other types of programs...but with numbers like that 
> it seems obvious that something is wrong.

Some other problems with naive TLS are that every thread gets its 
own copy of the same dataset, that you pollute the 3rd level 
cache compared to globals, and that globals can be fetched 
without a register (absolute addressing or relative to the 
program counter). I'd be wary of using TLS for larger data 
structures, but putting a pointer there instead gives you YET 
another indirection -> more cache misses...
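
To make the three layouts concrete, here is a rough C++ sketch (not 
the program from this thread, and not a benchmark). All names 
(g_table, t_inline, t_indirect, the bump_* helpers, kTableSize) are 
made up for illustration, and how each access is actually addressed 
depends on the compiler, target and TLS model.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <thread>

// All names below are hypothetical, chosen only for illustration.
constexpr std::size_t kTableSize = 1024;
using Table = std::array<int, kTableSize>;

// Plain global: a single copy shared by the whole process.
Table g_table{};

// Naive TLS: every thread that touches this gets its own full copy.
thread_local Table t_inline{};

// TLS pointer: only the pointer is per-thread; the data lives on the
// heap, so each access goes through one more indirection.
thread_local std::unique_ptr<Table> t_indirect;

// Not synchronized: only call this from one thread at a time.
int bump_global() { return ++g_table[0]; }

// Increments the calling thread's private copy.
int bump_inline() { return ++t_inline[0]; }

// Lazily allocates the calling thread's table, then increments it.
int bump_indirect() {
    if (!t_indirect) t_indirect = std::make_unique<Table>();
    return ++(*t_indirect)[0];
}

// Runs both TLS bumps on a fresh thread and returns what it observed.
int bump_on_new_thread() {
    int seen = 0;
    std::thread worker([&seen] { seen = bump_inline() + bump_indirect(); });
    worker.join();
    return seen;
}
```

A fresh thread starts from zeroed copies of both TLS variants no 
matter what the calling thread did before, which is the "every 
thread gets its own dataset" cost; the pointer variant keeps the 
per-thread TLS block small but pays for the extra load.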