Taking pipeline processing to the next level

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Mon Sep 5 06:38:52 PDT 2016


On 9/5/16 1:43 PM, Ethan Watson wrote:
> On Monday, 5 September 2016 at 08:21:53 UTC, Andrei Alexandrescu wrote:
>> What are the benchmarks and the numbers? What loss are you looking at?
>> -- Andrei
>
> Just looking at the example, and referencing the map code in
> std.algorithm.iteration, I can see multiple function calls instead of
> one, because every indexing of the mapped range re-applies the
> transformation instead of caching the result (see the first sketch
> below). I'm not sure whether the lambda declaration there takes its
> argument by ref or by value, but let's assume by value for the sake of
> argument. Depending on whether what's passed by value is a reference
> type or a value type, that could be either a cheap function call or an
> expensive one.
>
> But even if it took the argument by reference, it's still a function
> call. Function calls are generally The Devil(TM) in a gaming
> environment: the fewer you can make, the better.
>
> Random aside: there are streaming (non-temporal) store instructions
> available to me on x86 platforms, so I don't have to wait for the
> destination to be pulled into L1 cache before writing. The batching
> pattern Manu talks about can exploit this better (see the second
> sketch below). But I imagine copy could also take advantage of it
> when dealing with value types.
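
For reference, a minimal sketch of the re-evaluation behavior described
above. std.algorithm.iteration.map is lazy: the supplied function runs
on every access, so indexing the same element twice invokes the lambda
twice. The call counter here is purely illustrative:

import std.algorithm.iteration : map;
import std.stdio : writeln;

void main()
{
    int calls;
    // map is lazy; nothing is cached, the lambda runs on every access.
    auto doubled = [1, 2, 3].map!((int x) { ++calls; return x * 2; });

    auto a = doubled[1]; // first access at index 1: one call
    auto b = doubled[1]; // same index again: a second call
    writeln(a, " ", b, " ", calls); // prints: 4 4 2
}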
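
And a hedged sketch of the batch-shaped stage alluded to above; the
name scaleBatch and its signature are illustrative, not an existing
API. A stage that receives whole slices costs one call per batch
instead of one per element, and its plain inner loop is where a
compiler, or hand-written intrinsics such as movntps / _mm_stream_ps,
could issue the streaming stores mentioned:

// One call per batch rather than one call per element. The simple
// indexable loop is a natural candidate for vectorization and, on x86,
// for non-temporal stores.
void scaleBatch(const(float)[] src, float[] dst, float factor)
{
    assert(src.length == dst.length);
    foreach (i; 0 .. src.length)
        dst[i] = src[i] * factor;
}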

Understood. Would explicitly asking for vectorized operations be
acceptable? One school of thought holds that explicit invocation of
parallel operations is preferable to autovectorization and its ilk. A
rough sketch follows. -- Andrei
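
For concreteness, here is what the explicit route could look like
(x86-64 assumed, where core.simd exposes vector types directly; this is
an illustration, not a proposed interface):

import core.simd;

// Explicit SIMD: one vector operation processes four float lanes,
// regardless of whether an autovectorizer recognizes a surrounding loop.
float4 axpy(float4 a, float4 x, float4 y)
{
    return a * x + y;
}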

