DIP80: phobos additions

Sun Jun 14 12:09:31 PDT 2015

On Sunday, 14 June 2015 at 18:49:21 UTC, Ilya Yaroshenko wrote:
> Yes, but it would be hard to create SIMD optimised version.

Then again clang is getting better at this stuff.

> What do you think about this chain of steps?
>
> 1. Create generalised (only type template and maybe flags) BLAS 
> algorithms (probably slow) with a CBLAS-like API.
> 2. Allow users to use existing CBLAS libraries inside 
> generalised BLAS.
> 3. Start to improve generalised BLAS with SIMD instructions.
> 4. And then continue discussion about the type of matrices we 
> want...

Hmm… I don't know. In general I think the best thing to do is to 
develop libraries alongside a concrete project and then turn them 
into something more abstract.

If I had more time I think I would have made the assumption that 
we could make LDC produce whatever the next version of clang can do 
with pragmas/GCC extensions, and used that assumption for building 
some prototypes. So I would:

1. Prototype typical constructs in C, compile them with the next 
version of llvm/clang (with e.g. 4x loop unrolling, trying different 
optimization/vectorizing options), then look at the output as LLVM 
IR and assembly mnemonics.

2. Then write similar code with hardware-optimized BLAS and 
benchmark to find where pure C/LLVM code and BLAS calls (with 
their call overhead) break even.
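
To make step 1 concrete, here is a minimal sketch of the kind of 
prototype I mean. The function name axpy and the UNROLL4 macro are 
made up for illustration; the pragma is clang's loop-hint extension, 
so it is guarded and compiles to a plain loop elsewhere.

```c
#include <stddef.h>

/* Hypothetical helper macro: ask clang to unroll the next loop 4x.
   Other compilers get an empty macro and a plain loop. */
#if defined(__clang__)
#define UNROLL4 _Pragma("clang loop unroll_count(4)")
#else
#define UNROLL4
#endif

/* y[i] += a * x[i] for i in [0, n).
   restrict promises the compiler that x and y do not overlap,
   which the auto-vectorizer generally needs to know. */
void axpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    UNROLL4
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

With clang, something like "clang -O3 -S -emit-llvm axpy.c" shows the 
LLVM IR, "clang -O3 -S axpy.c" shows the assembly, and adding 
-Rpass=loop-vectorize reports which loops were vectorized. For step 2 
the counterpart would be a call such as cblas_saxpy from whatever 
CBLAS implementation is installed, timed over a range of n.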

Then you have a rough idea of what the limitations of the current 
infrastructure look like, and can start modelling the template 
types in D?

I'm not sure that you should use SIMD directly; rather, align the 
memory for it and let the compiler vectorize. Like, on iOS you end 
up using LLVM subsets because of the new bitcode requirements. 
Ditto for PNaCl.
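
As a rough sketch of "align the memory" without touching SIMD 
intrinsics: the names VEC_ALIGN, alloc_floats_aligned and 
is_vec_aligned below are hypothetical, and the 32-byte figure is an 
assumption (it suits 256-bit vectors; pick whatever the target 
wants). It relies on C11 aligned_alloc, which not every libc 
provides.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment in bytes; 32 matches 256-bit vector registers. */
#define VEC_ALIGN 32u

/* Allocate room for n floats on a VEC_ALIGN boundary.
   C11 aligned_alloc wants the size to be a multiple of the
   alignment, so the byte count is rounded up.
   Returns NULL on failure; release with free(). */
float *alloc_floats_aligned(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t padded = (bytes + VEC_ALIGN - 1) / VEC_ALIGN * VEC_ALIGN;
    if (padded == 0)
        padded = VEC_ALIGN;
    return aligned_alloc(VEC_ALIGN, padded);
}

/* Nonzero if p sits on a VEC_ALIGN boundary. */
int is_vec_aligned(const void *p)
{
    return ((uintptr_t)p % VEC_ALIGN) == 0;
}
```

The padding also means the buffer ends on a vector boundary, so a 
vectorized loop can run over the tail without a scalar cleanup 
reading past the allocation.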

Just a thought, but that's what I would do.