Taking pipeline processing to the next level
finalpatch via Digitalmars-d
digitalmars-d at puremagic.com
Tue Sep 6 19:00:39 PDT 2016
On Wednesday, 7 September 2016 at 01:38:47 UTC, Manu wrote:
> On 7 September 2016 at 11:04, finalpatch via Digitalmars-d
> <digitalmars-d at puremagic.com> wrote:
>>
>> It shouldn't be hard to have the framework look at the buffer
>> size and choose the scalar version when number of elements are
>> small, it wasn't done that way simply because we didn't need
>> it.
>
> No, what's hard is working this into D's pipeline patterns
> seamlessly.

The lesson I learned from this is that you need the user code to
provide a lot of extra information about the algorithm at compile
time for the templates to work out a way to fuse pipeline stages
together efficiently.
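
To make the point concrete, here is a rough C++ sketch of the idea. It is not code from the actual framework. The stage types `Scale` and `Offset`, the `elementwise` flag, and the helpers `fuse` and `run_fused` are all hypothetical names made up for illustration. Each stage declares at compile time that it is element-wise, and the template relies on that to collapse two stages into a single loop with no intermediate buffer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stages. Each one advertises a compile-time fact about
// itself: output element i depends only on input element i.
struct Scale {
    static constexpr bool elementwise = true;
    float factor;
    float operator()(float x) const { return x * factor; }
};

struct Offset {
    static constexpr bool elementwise = true;
    float amount;
    float operator()(float x) const { return x + amount; }
};

// Two stages fused into one. The static_assert is where the extra
// compile-time information supplied by user code gets checked.
template <typename First, typename Second>
struct Fused {
    static_assert(First::elementwise && Second::elementwise,
                  "only element-wise stages can share one loop");
    static constexpr bool elementwise = true;  // a fused stage can be fused again
    First first;
    Second second;
    float operator()(float x) const { return second(first(x)); }
};

template <typename First, typename Second>
Fused<First, Second> fuse(First a, Second b) {
    return Fused<First, Second>{a, b};
}

// Single pass over the data: every stage runs on an element before
// the loop moves on, so no temporary buffer sits between stages.
template <typename Stage>
void run_fused(const Stage& stage, const float* in, float* out, std::size_t n) {
    assert(n == 0 || (in != nullptr && out != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = stage(in[i]);
    }
}
```

With this, `run_fused(fuse(Scale{2.0f}, Offset{1.0f}), in, out, n)` makes one pass over the data. A D version could presumably derive the same property with `static if` and introspection instead of a hand-written flag.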

I believe it is possible to get something similar in D because D
has more powerful templates than C++ and D also has some type
introspection which C++ lacks. Unfortunately I'm not as good at
D, so I can only provide some ideas rather than actual working
code.

Once this problem is solved, the benefit is huge. In my case it
allowed me to perform high-level optimizations (streaming
load/save, prefetching, dynamic dispatching depending on data
alignment, etc.) in the main loop, where they automatically
benefit all kernels and pipelines.
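
As an illustration of why putting this in the main loop pays off, below is a minimal C++ sketch of a driver that picks the wide or the scalar path at run time. Everything here is an assumption for the sake of the example: the `AddOne` kernel, the `is_aligned` helper, the `run_dispatch` driver, and the "at least two vectors" threshold are hypothetical, and the 4-wide path is a plain loop standing in for real SIMD code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kernel that supplies a scalar form and a 4-wide form
// of the same operation. The wide form stands in for SIMD code.
struct AddOne {
    static constexpr std::size_t width = 4;
    void scalar(const float* in, float* out) const { out[0] = in[0] + 1.0f; }
    void wide(const float* in, float* out) const {
        for (std::size_t k = 0; k < width; ++k) out[k] = in[k] + 1.0f;
    }
};

inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Shared main loop. The dispatch logic lives here once, so every
// kernel written against this interface gets it for free.
// Returns how many elements went through the wide path.
template <typename Kernel>
std::size_t run_dispatch(const Kernel& k, const float* in, float* out,
                         std::size_t n) {
    const std::size_t w = Kernel::width;
    const std::size_t bytes = w * sizeof(float);
    std::size_t i = 0;
    std::size_t wide_count = 0;

    // Use the wide path only for buffers that are large enough
    // (assumed threshold: two vectors) and suitably aligned.
    if (n >= 2 * w && is_aligned(in, bytes) && is_aligned(out, bytes)) {
        for (; i + w <= n; i += w) {
            k.wide(in + i, out + i);
            wide_count += w;
        }
    }
    // Scalar path handles small or misaligned buffers and any tail.
    for (; i < n; ++i) {
        k.scalar(in + i, out + i);
    }
    return wide_count;
}
```

A real driver would add things like prefetching or streaming stores in the same place. The point is only that the kernel author never sees any of it.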