Taking pipeline processing to the next level

Sun Sep 4 22:17:36 PDT 2016

On 05/09/2016 5:08 PM, Manu via Digitalmars-d wrote:
> I mostly code like this now:
>   data.map!(x => transform(x)).copy(output);
>
> It's convenient and reads nicely, but it's generally inefficient.
> This sort of one-by-one software design is the core performance
> problem with OOP. It seems a shame to be suffering OOP's failures even
> when there is no OOP in sight.
>
> A central premise of performance-oriented programming which I've
> employed my entire career, is "where there is one, there is probably
> many", and if you do something to one, you should do it to many.
> With this in mind, the code I have always written doesn't tend to look
> like this:
>   R manipulate(Thing thing);
>
> Instead:
>   void manipulateThings(Thing *things, size_t numThings, R *output, size_t outputLen);
>
> Written this way for clarity. Obviously, the D equiv uses slices.
>
> All functions are implemented with the presumption they will operate
> on many things, rather than being called many times for each one.
> This is the single most successful design pattern I have ever
> encountered wrt high-performance code; ie, implement the array version
> first.
>
> The problem with this API design, is that it doesn't plug into
> algorithms or generic code well.
>   data.map!(x => transformThings(&x, 1)).copy(output);
>
> I often wonder how we can integrate this design principle conveniently
> (ie, seamlessly) into the design of algorithms, such that they can
> make use of batching functions internally, and transparently?
>
> Has anyone done any work in this area?
>
> Ideas?

Just a random idea:

import std.array : front, popFront, empty;
import std.range : ElementType, isInputRange;
int[] transformer(I)(I from, int[] buffer)
if (isInputRange!I && is(ElementType!I == int)) {
	size_t used;

	// ... transformation algo

	return buffer[0 .. used];
}

auto got = input.transformer(buffer).usage;

The input is taken as an input range rather than a plain array, so it
works with pretty much any array or input range.
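
To make the idea a bit more concrete, here is a minimal sketch of how the placeholder body might be filled in and driven batch by batch. Everything specific in it is an assumption for illustration: the "double each element" transformation is a stand-in for a real batch kernel, the buffer size of 4 is arbitrary, and the range is taken by ref (unlike the signature above) so the caller can see how much input was consumed.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical batch kernel: consumes up to buffer.length elements from
// `from`, writes the transformed values into `buffer`, and returns the
// filled part of the buffer.
// Assumption: `from` is taken by ref so the caller's range advances.
int[] transformer(I)(ref I from, int[] buffer)
if (isInputRange!I && is(ElementType!I == int))
{
    size_t used;

    // Stand-in transformation (assumption): double each element.
    while (used < buffer.length && !from.empty)
    {
        buffer[used++] = from.front * 2;
        from.popFront();
    }

    return buffer[0 .. used];
}

void main()
{
    import std.range : iota;
    import std.stdio : writeln;

    auto input = iota(0, 10); // any input range of int will do
    int[4] storage;           // fixed-size batch buffer

    // Drain the input one batch at a time.
    while (!input.empty)
    {
        auto got = input.transformer(storage[]);
        writeln(got); // [0, 2, 4, 6], then [8, 10, 12, 14], then [16, 18]
    }
}
```

The caller owns the buffer, so no allocation happens per batch, and the returned slice tells the caller how many results were actually produced. Whether this composes nicely with map and friends is a separate question; this only shows the batch function itself.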