Taking pipeline processing to the next level

finalpatch via Digitalmars-d digitalmars-d at puremagic.com
Tue Sep 6 19:00:39 PDT 2016


On Wednesday, 7 September 2016 at 01:38:47 UTC, Manu wrote:
> On 7 September 2016 at 11:04, finalpatch via Digitalmars-d 
> <digitalmars-d at puremagic.com> wrote:
>>
>> It shouldn't be hard to have the framework look at the buffer 
>> size and choose the scalar version when number of elements are 
>> small, it wasn't done that way simply because we didn't need 
>> it.
>
> No, what's hard is working this into D's pipeline patterns 
> seamlessly.

The lesson I learned from this is that the user code needs to 
provide a lot of extra information about the algorithm at compile 
time for the templates to work out a way to fuse pipeline stages 
together efficiently.
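
As a rough illustration of the kind of compile-time information meant 
here, below is a minimal C++ sketch. It is not the framework discussed 
in this thread; every name in it (Scale, Offset, lookahead, 
all_elementwise, run_fused) is hypothetical. Each stage advertises how 
many neighbouring elements it needs, and the driver uses that to check 
that the stages can be fused into a single pass over the data.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stages. Each one states at compile time how many
// neighbouring elements it reads; 0 means purely element-wise.
struct Scale {
    static constexpr std::size_t lookahead = 0;
    float factor;
    float operator()(float x) const { return x * factor; }
};

struct Offset {
    static constexpr std::size_t lookahead = 0;
    float amount;
    float operator()(float x) const { return x + amount; }
};

// Compile-time check: true only if every stage is element-wise.
template <typename... Stages>
struct all_elementwise : std::true_type {};

template <typename First, typename... Rest>
struct all_elementwise<First, Rest...>
    : std::integral_constant<bool, First::lookahead == 0 &&
                                       all_elementwise<Rest...>::value> {};

// Apply the stages left to right to one element.
template <typename T>
T apply_stages(T x) { return x; }

template <typename T, typename First, typename... Rest>
T apply_stages(T x, const First& first, const Rest&... rest) {
    return apply_stages(first(x), rest...);
}

// Fused driver: one loop, no intermediate buffers between stages.
template <typename... Stages>
void run_fused(const std::vector<float>& in, std::vector<float>& out,
               const Stages&... stages) {
    static_assert(all_elementwise<Stages...>::value,
                  "only element-wise stages can be fused into one pass");
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = apply_stages(in[i], stages...);
}
```

A call such as run_fused(in, out, Scale{2.0f}, Offset{1.0f}) makes one 
pass over the input and applies both stages to each element. A stage 
with a non-zero lookahead would be rejected at compile time in this 
sketch; a real framework would presumably pick a different fusion 
strategy instead.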

I believe it is possible to get something similar in D because D 
has more powerful templates than C++ and D also has some type 
introspection which C++ lacks.  Unfortunately I'm not as good at 
D so I can only provide some ideas rather than actual working 
code.

Once this problem is solved, the benefit is huge.  In my case it 
allowed me to perform high level optimizations (streaming 
load/save, prefetching, dynamic dispatching depending on data 
alignment etc.) in the main loop, which automatically benefit all 
kernels and pipelines.
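
To make the "main loop" idea concrete, here is a small C++ sketch of a 
shared driver that chooses a path from the buffer size and alignment. 
It is written under assumptions of my own: the names (Doubler, Path, 
run), the 4 * width threshold and the block width are all made up for 
illustration, and block() is a plain loop standing in for a real SIMD 
kernel.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel: supplies a scalar version and a block version
// that handles `width` elements at once (placeholder for SIMD code).
struct Doubler {
    static constexpr std::size_t width = 4;
    static float scalar(float x) { return x * 2.0f; }
    static void block(const float* in, float* out) {
        for (std::size_t i = 0; i < width; ++i) out[i] = in[i] * 2.0f;
    }
};

enum class Path { Scalar, Blocked };

// Shared main loop: every kernel gets the same dispatch logic for free.
template <typename Kernel>
Path run(const float* in, float* out, std::size_t n) {
    constexpr std::size_t w = Kernel::width;

    // Small buffers: the blocked path is not worth its setup cost.
    // The threshold here is an arbitrary assumption.
    if (n < 4 * w) {
        for (std::size_t i = 0; i < n; ++i) out[i] = Kernel::scalar(in[i]);
        return Path::Scalar;
    }

    std::size_t i = 0;
    // Head: scalar steps until the input pointer is block-aligned.
    while (i < n &&
           reinterpret_cast<std::uintptr_t>(in + i) % (w * sizeof(float)) != 0) {
        out[i] = Kernel::scalar(in[i]);
        ++i;
    }
    // Body: whole blocks starting from an aligned input address.
    for (; i + w <= n; i += w) Kernel::block(in + i, out + i);
    // Tail: leftover elements that do not fill a block.
    for (; i < n; ++i) out[i] = Kernel::scalar(in[i]);
    return Path::Blocked;
}
```

Because the dispatch lives in run() rather than in each kernel, adding 
prefetching or a streaming store to the body loop would apply to every 
kernel that plugs into it, which is the benefit described above.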


