Unsynchronized int access from threads

Fri Jun 19 05:54:31 UTC 2020

On Fri, Jun 19, 2020 at 06:00:32AM +0200, Timon Gehr via Digitalmars-d wrote:
> On 19.06.20 00:15, H. S. Teoh wrote:
> > Oh well, I'll just think of another way of parallelizing my code.
> 
> I think Steven just says that your reads and writes have to be atomic
> for the behavior to be defined at language level, so using core.atomic
> with memory order `raw` may in fact do what you want. (Or not, I am
> not sure what your full use case is.) I don't think in terms of
> generated assembly this will differ from what you had in mind as
> aligned word stores and loads are atomic on x86.
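
For concreteness, the suggestion quoted above might look roughly like the following in D. This is only a minimal sketch: the array `cells` and the helpers `storeAll` and `countZeroes` are hypothetical names made up for illustration, not code from this thread. It uses `atomicLoad` and `atomicStore` from core.atomic with `MemoryOrder.raw`, which makes each individual access atomic but gives no ordering guarantees between different accesses.

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;

// Hypothetical shared array that one thread updates while another scans it.
shared int[8] cells;

// Writer side: store a value into every cell with a raw (relaxed) atomic store.
void storeAll(int value)
{
    foreach (ref c; cells)
        atomicStore!(MemoryOrder.raw)(c, value);
}

// Reader side: count zero cells using raw (relaxed) atomic loads.
// Each load is atomic, but the count is only a snapshot that may mix
// old and new values if a writer is running concurrently.
size_t countZeroes()
{
    size_t n = 0;
    foreach (ref c; cells)
        if (atomicLoad!(MemoryOrder.raw)(c) == 0)
            ++n;
    return n;
}
```

Whether a possibly stale zero count is acceptable depends on the algorithm; if it is not, a stronger memory order or a restructuring of the work is needed.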

What I meant is that I'm thinking of optimizing at a higher level, to
arrange it in such a way that I don't need to lock. E.g., if scanning
for zeroes can't easily multithread with updating the array, perhaps the
better way to do it is to let them run sequentially, but multithread
across multiple instances of the computation instead.

I did some experiments along this line, and I'm finding that I can
boost performance by 2x by multithreading at a higher level in the
code, but this comes with significant overhead for small problem
sizes. I probably need some kind of heuristic to switch between
plain ole single-threaded code (for small problems, where the overhead
of parallel computation outweighs the benefits -- I'm seeing a 2x
slowdown for smaller problem sizes), and the parallel version for
larger problems, where the overhead of multithreading is dwarfed by the
overall performance gain.
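
A rough sketch of that kind of heuristic, using std.parallelism, might look like the code below. Everything named here is an assumption for illustration: `runInstance` is a stand-in for one self-contained instance of the computation (update, then scan, sequentially), and `parallelThreshold` is a made-up cutoff that would have to be tuned by measurement.

```d
import std.parallelism : parallel;

// Hypothetical cutoff (total number of elements); must be measured per workload.
enum size_t parallelThreshold = 10_000;

// Hypothetical stand-in for one independent instance of the computation:
// update the array, then scan it for zeroes, all within a single thread.
size_t runInstance(int[] data)
{
    foreach (ref x; data)
        x = (x * 31 + 7) % 5;      // placeholder update step
    size_t zeroes = 0;
    foreach (x; data)
        if (x == 0) ++zeroes;      // placeholder scan step
    return zeroes;
}

// Run all instances, choosing serial or parallel execution by total size.
size_t[] runAll(int[][] instances)
{
    auto results = new size_t[instances.length];

    size_t totalWork = 0;
    foreach (inst; instances)
        totalWork += inst.length;

    if (totalWork < parallelThreshold)
    {
        // Small problem: thread overhead would dominate, so stay serial.
        foreach (i, inst; instances)
            results[i] = runInstance(inst);
    }
    else
    {
        // Large problem: each instance is independent, so no locking is
        // needed; every iteration writes only to its own results[i].
        foreach (i, inst; parallel(instances))
            results[i] = runInstance(inst);
    }
    return results;
}
```

Because each instance touches only its own array and its own result slot, the parallel branch needs neither locks nor atomics; the only shared decision is the size check up front.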

T

-- 
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next. -- (Stolen from the net)