fmohamed at mac.com
Sun Dec 7 00:05:21 PST 2008
On 2008-12-07 03:48:40 +0100, Sean Kelly <sean at invisibleduck.org> said:
> Fawzi Mohamed wrote:
>> On 2008-12-06 17:13:34 +0100, Sean Kelly <sean at invisibleduck.org> said:
>>> Fawzi Mohamed wrote:
>>>> a memory barrier would be needed, and atomic decrements, but I see that
>>>> it is not portable...
>>> It would also somewhat defeat the purpose of thread_needLock, since IMO
>>> this routine should be fast. If memory barriers are involved then it
>>> may as well simply use a mutex itself, and this is exactly what it's
>>> intended to avoid.
>> the memory barrier would be needed in the code that decrements the
>> number of active threads, so that you are sure that no pending writes
>> are still there, (that is the problem that you said brought you to
>> switch to a multithreaded flag), not in the code of thread_needLock...
> Not true. You would need an acquire barrier in thread_needLock.
> However, on x86 the point is probably moot since loads have acquire
> semantics anyway.
It would take a very aggressive processor to speculatively hoist loads
above a function call and a branch. As far as I know, even the Alpha did
not do that.
A volatile statement will probably be enough in all cases, but you are
right that to be strictly correct a load barrier is needed. Even on
a processor where this might matter, its cost in the fast path will be
basically zero (so still better than a lock).
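
To make the two sides concrete, here is a rough sketch in C++ rather than
D, only because std::atomic spells out the ordering explicitly. The names
activeThreads, threadStarted, threadExited and needLock are made up for
the illustration and are not the runtime's actual code. The decrement
carries release ordering and the fast-path check is a single acquire
load, which on x86 typically compiles to an ordinary load:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical count of running threads (the main thread counts as 1).
// The real runtime keeps its own bookkeeping; this is only a stand-in.
static std::atomic<int> activeThreads{1};

// Called when a new thread is started.
void threadStarted() {
    activeThreads.fetch_add(1, std::memory_order_release);
}

// Called by a thread as its last action. Release ordering keeps the
// thread's earlier writes from being reordered after the decrement.
void threadExited() {
    int previous = activeThreads.fetch_sub(1, std::memory_order_release);
    assert(previous > 1); // the main thread never exits through here
    (void)previous;       // silence unused warning when NDEBUG is set
}

// Fast path: one acquire load and a compare, no mutex.
bool needLock() {
    return activeThreads.load(std::memory_order_acquire) > 1;
}
```

The acquire load pairs with the release decrement, so once needLock()
sees the count back at 1, the writes the exiting thread made before its
decrement are visible to the caller.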
>> But again I would say that this optimization is not really worth it (as
>> you also said it), even if it is relevant for GUI applications.