Taking pipeline processing to the next level

Tue Sep 6 07:26:22 PDT 2016

On Tuesday, 6 September 2016 at 14:21:01 UTC, finalpatch wrote:
> Then some template magic will figure out the LCM of the 2 
> kernels' pixel width is 3*4=12 and therefore they are fused 
> together into a composite kernel of pixel width 12.  The above 
> line compiles down into a single function invocation, with a 
> main loop that reads the source buffer in 4 pixels step, call 
> MySimpleKernel 3 times, then call AnotherKernel 4 times.

Correction:
with a main loop that reads the source buffer in steps of *12* 
pixels, calls MySimpleKernel 3 times, then calls AnotherKernel 4 
times.
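
To make the corrected loop shape concrete, here is a rough C++17 sketch of
the idea. It is only an illustration, not the actual framework: the member
names (width, run, calls), the Fused and process templates, and the pixel
arithmetic are all made up for this example. The kernel widths of 4 and 3
are inferred from the call counts above (12/4 = 3 calls, 12/3 = 4 calls).

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical kernel: handles 4 pixels per call (width assumed, see above).
struct MySimpleKernel {
    static constexpr std::size_t width = 4;
    static inline int calls = 0;  // call counter, for demonstration only
    static void run(float* px) {
        ++calls;
        for (std::size_t i = 0; i < width; ++i) px[i] += 1.0f;
    }
};

// Hypothetical kernel: handles 3 pixels per call (width assumed, see above).
struct AnotherKernel {
    static constexpr std::size_t width = 3;
    static inline int calls = 0;  // call counter, for demonstration only
    static void run(float* px) {
        ++calls;
        for (std::size_t i = 0; i < width; ++i) px[i] *= 2.0f;
    }
};

// Composite kernel whose width is the LCM of the two kernel widths,
// computed at compile time.
template <class A, class B>
struct Fused {
    static constexpr std::size_t width = std::lcm(A::width, B::width);
    static void run(float* px) {
        // Run A over the whole block first, then B over the same block.
        for (std::size_t i = 0; i < width; i += A::width) A::run(px + i);
        for (std::size_t i = 0; i < width; i += B::width) B::run(px + i);
    }
};

// Main loop: steps through the buffer K::width pixels at a time.
// Assumes buf.size() is a multiple of K::width; any tail is left untouched.
template <class K>
void process(std::vector<float>& buf) {
    for (std::size_t i = 0; i + K::width <= buf.size(); i += K::width)
        K::run(buf.data() + i);
}

// LCM(4, 3) = 12, so the fused main loop advances 12 pixels per iteration.
static_assert(Fused<MySimpleKernel, AnotherKernel>::width == 12);
```

Calling process<Fused<MySimpleKernel, AnotherKernel>>(buf) on a buffer whose
length is a multiple of 12 then runs one loop that, per 12-pixel step, calls
MySimpleKernel 3 times and AnotherKernel 4 times.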