Taking pipeline processing to the next level

Manu via Digitalmars-d digitalmars-d at puremagic.com
Mon Sep 5 20:00:13 PDT 2016


On 5 September 2016 at 20:32, Johannes Pfau via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> Am Mon, 5 Sep 2016 10:21:53 +0200
> schrieb Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org>:
>
>> On 9/5/16 7:08 AM, Manu via Digitalmars-d wrote:
>> > I mostly code like this now:
>> >   data.map!(x => transform(x)).copy(output);
>> >
>> > It's convenient and reads nicely, but it's generally inefficient.
>>
>> What are the benchmarks and the numbers? What loss are you looking
>> at? -- Andrei
>
> As Manu posted this question (and he's working on a color/image library)
> it's not hard to guess one problem is SIMD/vectorization. E.g. if
> transform(x) => x + 2, it is faster to perform one SIMD operation on 4
> values than 4 individual adds.
>
> As autovectorization is not very powerful in current compilers, I can
> easily imagine that complex range-based examples can't compete with
> hand-written SIMD loops.
>
> @Manu: Have you had a look at std.parallelism? I think it has some kind
> of parallel map which could provide some inspiration?
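
To illustrate the quoted point about vector width: a minimal sketch,
assuming core.simd's float4 is available (x86_64 with SSE) and with
nothing benchmarked, just to show what "one operation on 4 values"
means:

  import core.simd;   // float4 assumed available (x86_64 with SSE)

  void main()
  {
      // scalar: four separate adds, one per element
      float[4] a = [1.0f, 2.0f, 3.0f, 4.0f];
      foreach (ref x; a)
          x += 2.0f;

      // SIMD: one vector add covers all 4 lanes at once
      float4 v = [1.0f, 2.0f, 3.0f, 4.0f];
      float4 two = [2.0f, 2.0f, 2.0f, 2.0f];
      v += two;       // a single vector add instruction

      // same results either way; the vector path did it in one op
      assert(a[3] == 6.0f && v.array[3] == 6.0f);
  }

This is the kind of operation a hand-written loop gets trivially and an
element-at-a-time range pipeline generally doesn't.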

I have. Even just chunks() and joiner() can do the trick to a
reasonable extent, but it's still not great. It's definitely not where
I'd like it to be. End-users won't manually deploy these strategies
correctly (or at all); I'd like to see a design that enables more
automatic deployment of batch processing. I treat the end-user like a
JavaScript user; they shouldn't need to do hard work to make proper
use of a lib. That's a poor API offering on the part of the lib author.
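
To make that concrete, this is roughly the kind of manual deployment I
mean. It's only a sketch: transformBatch is a hypothetical batch kernel
(the place where SIMD would actually go), the chunk size of 64 is
arbitrary, and nothing here is benchmarked.

  import std.algorithm : copy, joiner, map;
  import std.range : chunks;

  float transform(float x) { return x + 2.0f; }

  // hypothetical batch kernel: processes a whole slice at once,
  // in place, and returns the same slice
  float[] transformBatch(float[] batch)
  {
      foreach (ref x; batch)
          x = transform(x);
      return batch;
  }

  void main()
  {
      auto data = new float[](1024);
      data[] = 1.0f;
      auto output = new float[](1024);

      // the convenient element-at-a-time form
      data.map!(x => transform(x)).copy(output);

      // the batched form the user has to assemble by hand
      // (note: transformBatch also mutates `data` through the chunk slices)
      data.chunks(64)
          .map!(c => transformBatch(c))
          .joiner
          .copy(output);
  }

That second pipeline is exactly the part end-users won't write
themselves, which is why I'd rather the batching fell out of the
design automatically.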

