problem with parallel foreach
John Colvin via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Wed May 13 07:43:49 PDT 2015
On Wednesday, 13 May 2015 at 14:28:52 UTC, Gerald Jansen wrote:
> On Wednesday, 13 May 2015 at 13:40:33 UTC, John Colvin wrote:
>> On Wednesday, 13 May 2015 at 11:33:55 UTC, John Colvin wrote:
>>> On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote:
>>>> On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole
>>>> wrote:
>>>>> On 13/05/2015 4:20 a.m., Gerald Jansen wrote:
>>>>>> At the risk of great embarrassment ... here's my program:
>>>>>> http://dekoppel.eu/tmp/pedupg.d
>>>>>
>>>>> Would it be possible to give us some example data?
>>>>> I might give it a go to try rewriting it tomorrow.
>>>>
>>>> http://dekoppel.eu/tmp/pedupgLarge.tar.gz (89 MB)
>>>>
>>>> Contains two largish datasets in a directory structure
>>>> expected by the program.
>>>
>>> I only see 2 traits in that example, so it's hard for anyone
>>> to explore your scaling problem, seeing as there are a
>>> maximum of 2 tasks.
>>
>> Either way, a few small changes were enough to cut the runtime
>> by a factor of ~6 in the single-threaded case and improve the
>> scaling a bit, although the printing to output files still
>> looks like a bit of a bottleneck.
>>
>> http://dpaste.dzfl.pl/80cd36fd6796
>>
>> The key thing was reducing the number of allocations (more
>> std.algorithm.splitter copying to static arrays, less
>> std.array.split) and avoiding File.byLine. Other people in
>> this thread have mentioned alternatives to it that may be
>> faster or use less memory; I just read each whole file into
>> memory and then lazily split it with std.algorithm.splitter.
>> I ended up with some blank lines coming through, so I added
>> if(line.empty) continue; in a few places. You might want to
>> look more carefully at that, as it could be my mistake.
>>
>> The use of std.array.appender for `info` is just good
>> practice, but it doesn't make much difference here.
>
> Wow, I'm impressed with the effort you guys (John, Rikki,
> others) are making to teach me some efficiency tricks. I guess
> this is one of the strengths of D: its community. I'm studying
> your various contributions closely!
>
> The empty line comes from the very last line of each file,
> which also ends with a newline (as per "normal" practice?).
Yup, that would be it.
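To illustrate where the blank line comes from, here is a minimal sketch. It is not taken from the pasted program: the function name countRecords and the sample text are made up for the example. Splitting on '\n' yields one extra, empty element when the text ends with a newline, so the loop skips it.

```d
import std.algorithm : splitter;
import std.stdio : writeln;

/// Counts the non-empty lines in `text`.
/// Hypothetical helper for illustration only.
size_t countRecords(string text)
{
    size_t records = 0;
    // splitter is lazy: each `line` is a slice of `text`, nothing is copied.
    foreach (line; text.splitter('\n'))
    {
        // A trailing '\n' produces an empty final element; skip it.
        if (line.length == 0)
            continue;
        ++records;
    }
    return records;
}

void main()
{
    // Stand-in for a whole file read into memory; note the final newline.
    string text = "1 2 3\n4 5 6\n";
    writeln(countRecords(text), " records");
}
```

Without the length check, the loop body would run three times for this input rather than two.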
I added a bit of buffered writing and it actually seems to scale
quite well for me now.
http://dpaste.dzfl.pl/710afe8b6df5
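One possible way to buffer the output is sketched below. This is an assumption about the general approach, not a copy of the paste above: the function renderRows, the row layout, and the file name out.txt are all hypothetical. The idea is to format everything into an in-memory std.array.appender and hand the result to the file in a single write.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : File;

/// Formats all rows into one string, space-separated, one row per line.
/// Hypothetical helper; the real program's output layout may differ.
string renderRows(const int[][] rows)
{
    auto buf = appender!string();
    foreach (row; rows)
    {
        foreach (i, value; row)
        {
            if (i > 0)
                buf.put(' ');
            buf.formattedWrite("%d", value);
        }
        buf.put('\n');
    }
    return buf.data;
}

void main()
{
    const int[][] rows = [[1, 2, 3], [4, 5, 6]];
    // One write call for the whole file instead of one per line.
    auto f = File("out.txt", "w"); // hypothetical output path
    f.write(renderRows(rows));
}
```

Whether this helps depends on how much time the program spends in per-line writes, so it is worth measuring before and after.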