Thread-local storage and Performance

Walter Bright newshound1 at digitalmars.com
Mon Oct 26 11:35:39 PDT 2009


dsimcha wrote:
> == Quote from Pelle Månsson (pelle.mansson at gmail.com)'s article
>> dsimcha wrote:
>>> Has D's builtin TLS been optimized in the past six months to a year?  I had
>>> benchmarked it a while back when optimizing some code that I wrote and
>>> discovered it was significantly slower than regular globals (the kind that are
>>> now __gshared).  Now, at least on Windows, there seems to be no
>>> discernible difference and, if anything, TLS is slightly faster than __gshared.
>>> What's changed?
>> I was under the impression that TLS should be faster due to absence of
>> synchronization.
> 
> __gshared == old-skool cowboy sharing, i.e. plain old unsynchronized globals.
> 
> Without getting into the details of my specific case, the reason I'm interested in
> this is that I have some code that I want to be as fast as possible in both
> single- and multithreaded environments.  Right now, it has a hack that checks
> thread_needLock() and uses plain old globals for everything as long as the program
> is single-threaded because that seemed faster than TLS lookups a while ago.
> However, running the same benchmark again shows otherwise.

Nothing has changed. What I would do is look at the assembler output
and verify that the TLS globals really are TLS, and that the ones that
are not supposed to be TLS really are not.
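
A minimal sketch of such a check, assuming D2 semantics where a
module-level variable is thread-local unless marked __gshared. The file
name and the names tlsCounter, sharedCounter, bumpTls and bumpShared are
made up for illustration and do not come from the original code:

```d
// tls_check.d: hypothetical file name, used only for this illustration.
import core.thread : Thread;

int tlsCounter;               // D2 default: one copy per thread (TLS)
__gshared int sharedCounter;  // one copy per process, unsynchronized

// Small accessors so each kind of access is easy to find in a disassembly.
int bumpTls()    { return ++tlsCounter; }
int bumpShared() { return ++sharedCounter; }

void main()
{
    bumpTls();
    bumpShared();

    // A second thread gets a fresh tlsCounter but sees the same sharedCounter.
    int tlsSeen, sharedSeen;
    auto t = new Thread({
        tlsSeen = bumpTls();
        sharedSeen = bumpShared();
    });
    t.start();
    t.join();

    assert(tlsSeen == 1);     // the new thread's own copy started at 0
    assert(sharedSeen == 2);  // main had already bumped the global once
}
```

Compiling with something like "dmd -c tls_check.d" and disassembling the
object file (obj2asm, or objdump -d on Linux) should show the difference:
the body of bumpTls would typically reach its variable through a
segment-register-based TLS lookup, while bumpShared would use a plain
static address. The exact instruction sequence depends on platform and
compiler version, so treat this as a rough guide rather than a reference.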


