How to store data when using parallel processing

Sat Aug 26 18:58:04 PDT 2017

On Sunday, August 27, 2017 00:26:33 Andrew Chapman via Digitalmars-d-learn 
wrote:
> Hi all, just wanting some advice on parallel processing and
> specifically how to deal with access violations.
>
> I am reading a list of words from a file like this:
>
> auto fileHandle = File("wordlist.txt", "r");
>
> string word;
> string[] words;
> string[ulong] hashMap;
>
> while ((word = fileHandle.readln()) !is null) {
>   words ~= word;
> }
>
> Then I'm doing some processing on the words.  I want to make this
> run as quickly as possible so I am doing the processing across
> the cores of my CPU like this:
>
> foreach (thisWord; parallel(words)) {
>   string wordLower = thisWord.strip().toLower();
>   ulong key = keyMaker.createKeyForWord(wordLower);
>
>          // hashMap[key] = wordLower;
> }
>
> The question is, in the above loop, how can I make the commented
> out line work without having an access violation.  Do I need to
> use a different data structure?  Or rethink what I'm doing?

std.parallelism is designed to work on stuff that's truly parallel, whereas
adding a value to an associative array (AA) is not. For multiple threads to
be able to access the AA, it needs to be protected by a mutex or a
synchronized block, in which
case the assignment is no longer parallel. Whether that matters much depends
on how expensive the rest of what the loop is doing is. If it's cheap
enough, then the threads will all just end up blocking on the mutex,
effectively making the code serial and making the parallelism pointless,
whereas if it's expensive enough that the threads will spend most of their
time doing the rest of the loop, then you can get an increase in performance
over just doing it in one thread. It wouldn't surprise me if simply doing
all of the work in the first loop and not creating the words array in the
first place would be faster than trying to parallelize the second loop, but
I don't know. You'd have to test it and see.

std.parallelism also cheats with regards to shared (which is part of why it's
@system), hiding the fact that you're dealing with multiple threads, but all
of the issues when using shared data remain (e.g. needing to use mutexes to
protect against data races). std.parallelism just normally manages to avoid
that problem by operating on separate pieces of data simultaneously rather
than operating on the same data on multiple threads at the same time,
whereas what you're doing with regards to the result involves operating on
the same data from multiple threads at the same time, which doesn't work
without mutexes.
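
A minimal sketch of the mutex approach, assuming a hypothetical
createKeyForWord function in place of the keyMaker.createKeyForWord call from
the question (its real implementation isn't shown in the thread), and a
made-up wrapper name, buildMapLocked. The per-word work runs in parallel, and
only the AA write sits inside a synchronized block:

```d
import std.parallelism : parallel;
import std.string : strip;
import std.uni : toLower;

// Hypothetical stand-in for keyMaker.createKeyForWord from the question.
ulong createKeyForWord(string word)
{
    return hashOf(word);
}

string[ulong] buildMapLocked(string[] words)
{
    string[ulong] hashMap;

    foreach (thisWord; parallel(words))
    {
        // This part touches only per-iteration data, so it runs in parallel.
        string wordLower = thisWord.strip().toLower();
        ulong key = createKeyForWord(wordLower);

        // Only one thread at a time may execute this block, so the
        // writes to the shared AA are serialized.
        synchronized
        {
            hashMap[key] = wordLower;
        }
    }
    return hashMap;
}

unittest
{
    auto m = buildMapLocked(["  Apple\n", "BANANA\n"]);
    assert(m[createKeyForWord("apple")] == "apple");
    assert(m[createKeyForWord("banana")] == "banana");
}
```

Whether this beats a plain serial loop depends on how expensive the key
computation is relative to the locked write, so it would need to be measured.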

An alternative would be to have a separate AA per thread, and then you
combine them after the loop, but that requires checking which thread you're
on so that you grab the correct one in a particular iteration of the loop. I
assume that std.parallelism provides a reasonably easy way to do that, but I
haven't done much with it, so I don't know. Worst case, you can probably go
off of the thread ID using core.thread.
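
One possible sketch of the AA-per-thread idea, using taskPool.workerIndex to
pick the slot for the current thread (it is 0 for a thread outside the pool,
such as the one running the foreach, and 1 through taskPool.size for the
worker threads). As before, createKeyForWord is a hypothetical stand-in for
the question's keyMaker.createKeyForWord, and buildMapPerThread is a made-up
name:

```d
import std.parallelism : parallel, taskPool;
import std.string : strip;
import std.uni : toLower;

// Hypothetical stand-in for keyMaker.createKeyForWord from the question.
ulong createKeyForWord(string word)
{
    return hashOf(word);
}

string[ulong] buildMapPerThread(string[] words)
{
    // One AA per worker thread, plus slot 0 for the calling thread,
    // which also executes loop iterations.
    string[ulong][] perThread;
    perThread.length = taskPool.size + 1;

    foreach (thisWord; parallel(words))
    {
        string wordLower = thisWord.strip().toLower();
        ulong key = createKeyForWord(wordLower);

        // Each thread writes only to its own AA, so no lock is needed.
        perThread[taskPool.workerIndex][key] = wordLower;
    }

    // The parallel loop has finished, so merge on a single thread.
    string[ulong] hashMap;
    foreach (part; perThread)
        foreach (key, value; part)
            hashMap[key] = value;
    return hashMap;
}

unittest
{
    auto m = buildMapPerThread(["  Apple\n", "BANANA\n"]);
    assert(m[createKeyForWord("apple")] == "apple");
    assert(m[createKeyForWord("banana")] == "banana");
}
```

std.parallelism also has taskPool.workerLocalStorage, which wraps the same
per-worker indexing and may be a tidier fit.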

std.parallelism may offer other ways to accomplish this, but I'd have to
study it to be sure. Either way, fundamentally, you have two options: either
you protect the hash table with a mutex, in which case the writes are
serialized even if the data creation is parallelized, or you store the data
from each thread separately while the threads are running and then combine
the data when the threads are done.

- Jonathan M Davis