druntime thread_needLock()

Sun Dec 7 00:05:21 PST 2008

On 2008-12-07 03:48:40 +0100, Sean Kelly <sean at invisibleduck.org> said:

> Fawzi Mohamed wrote:
>> On 2008-12-06 17:13:34 +0100, Sean Kelly <sean at invisibleduck.org> said:
>> 
>>> Fawzi Mohamed wrote:
>>>> 
>>>> a memory barrier would be needed, and atomic decrements, but I see that 
>>>> it is not portable...
>>> 
>>> It would also somewhat defeat the purpose of thread_needLock, since IMO 
>>> this routine should be fast.  If memory barriers are involved then it 
>>> may as well simply use a mutex itself, and this is exactly what it's 
>>> intended to avoid.
>> 
>> the memory barrier would be needed in the code that decrements the 
>> number of active threads, so that you are sure that no pending writes 
>> are still there, (that is the problem that you said brought you to 
>> switch to a multithreaded flag), not in the code of thread_needLock...
> 
> Not true.  You would need an acquire barrier in thread_needLock. 
> However, on x86 the point is probably moot since loads have acquire 
> semantics anyway.

You would need a very aggressive processor to speculatively reorder 
loads before a function call and a branch. As far as I know even the 
Alpha did not do it.
A volatile statement will probably be enough in all cases, but you are 
right that to be really correct a load barrier should be used, and even 
on a processor where this might matter its cost in the fast path 
will be basically 0 (so still better than a lock).
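
To make the barrier placement concrete, here is a minimal sketch of the counter-based scheme discussed above. It is written in C++ rather than D, purely as an illustration; it is not druntime's actual code, and the names active_threads, thread_started, thread_exited and need_lock are hypothetical. The decrement uses release ordering so that the exiting thread's pending writes are visible before the count drops, and the check uses an acquire load, which is the barrier Sean refers to.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter of live threads; starts at 1 for the main thread.
// Illustration only, not how druntime tracks threads.
static std::atomic<int> active_threads{1};

// Called by the parent before it spawns a new thread.
void thread_started() {
    active_threads.fetch_add(1, std::memory_order_acq_rel);
}

// Called by a thread as the last thing it does before terminating.
// Release ordering: all writes made by this thread become visible to
// anyone who later observes the decremented count with an acquire load.
void thread_exited() {
    int previous = active_threads.fetch_sub(1, std::memory_order_release);
    assert(previous > 1);  // the main thread is assumed never to call this
    (void)previous;        // silence unused warning when NDEBUG is defined
}

// Fast-path check: true if more than one thread may be running, so the
// caller must take its lock. The acquire load pairs with the release
// decrement in thread_exited().
bool need_lock() {
    return active_threads.load(std::memory_order_acquire) > 1;
}
```

With this layout the fast path is a single atomic load and a compare. On x86 an acquire load compiles to an ordinary load, so the barrier there costs nothing extra; on weaker memory models it may add a fence, but still no lock.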

> 
>> But again I would say that this optimization is not really worth it (as 
>> you also said it), even if it is relevant for GUI applications.
> 
> :-)
> 
> 
> Sean