2016Q1: std.blas
Charles McAnany via Digitalmars-d-announce
digitalmars-d-announce at puremagic.com
Sat Dec 26 21:43:47 PST 2015
On Saturday, 26 December 2015 at 19:57:19 UTC, Ilya Yaroshenko
wrote:
> Hi,
>
> I will write GEMM and GEMV families of BLAS for Phobos.
>
> Goals:
> - code without assembler
> - code based on SIMD instructions
> - DMD/LDC/GDC support
> - kernel based architecture like OpenBLAS
> - 85-100% FLOPS comparing with OpenBLAS (100%)
> - tiny generic code comparing with OpenBLAS
> - ability to define user kernels
> - allocators support. GEMM requires small internal allocations.
> - @nogc nothrow pure template functions (depends on allocator)
> - optional multithreaded
> - ability to work with `Slice` multidimensional arrays when
> stride between elements in vector is greater than 1. In common
> BLAS matrix strides between rows or columns always equals 1.
>
> Implementation details:
> LDC all : very generic D/LLVM IR kernels. AVX/2/512/neon
> support is out of the box.
> DMD/GDC x86 : kernels for 8 XMM registers based on core.simd
> DMD/GDC x86_64: kernels for 16 XMM registers based on core.simd
> DMD/GDC other : generic kernels without SIMD instructions.
> AVX/2/512 support can be added in the future.
>
> References:
> [1] Anatomy of High-Performance Matrix Multiplication:
> http://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf
> [2] OpenBLAS https://github.com/xianyi/OpenBLAS
>
> Happy New Year!
>
> Ilya
I am absolutely thrilled! I've been using scid
(https://github.com/kyllingstad/scid) and cblas
(https://github.com/DlangScience/cblas) in a project, and I can't
wait to see a smooth integration in the standard library.
A couple of questions:
Why will the functions be nothrow? It seems that if you try to
take the determinant of a 3x5 matrix, you should get an exception.
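To make the question concrete, here is a minimal sketch of the two styles. Everything in it is an assumption for illustration: the `Matrix` struct, `trace`, and `tryTrace` are hypothetical names, not proposed std.blas API, and the trace stands in for any square-only operation such as the determinant.

```d
import std.exception : assertThrown, enforce;

// Hypothetical dense row-major matrix, for illustration only.
struct Matrix
{
    size_t rows, cols;
    double[] data; // rows * cols elements
}

// Throwing style: a non-square argument raises an Exception.
double trace(Matrix m)
{
    enforce(m.rows == m.cols, "trace requires a square matrix");
    double sum = 0;
    foreach (size_t i; 0 .. m.rows)
        sum += m.data[i * m.cols + i];
    return sum;
}

// nothrow style: failure is reported through the return value instead.
bool tryTrace(Matrix m, out double result) nothrow @nogc pure
{
    if (m.rows != m.cols)
        return false;
    result = 0;
    foreach (size_t i; 0 .. m.rows)
        result += m.data[i * m.cols + i];
    return true;
}

unittest
{
    auto square = Matrix(2, 2, [1.0, 2.0, 3.0, 4.0]);
    auto rect = Matrix(3, 5, new double[15]);

    assert(trace(square) == 5.0);
    assertThrown!Exception(trace(rect)); // 3x5 is rejected with an exception

    double r;
    assert(tryTrace(square, r) && r == 5.0);
    assert(!tryTrace(rect, r)); // 3x5 is rejected without throwing
}
```

The second form stays compatible with `@nogc nothrow pure`, at the cost of every caller having to check the result.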
By 'tiny generic code', do you mean that DGEMM, SSYMM, CTRMM, etc.
all basically become one function?
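What I have in mind is something like the sketch below, where one template is generic over the element type and so stands in for SGEMM, DGEMM, CGEMM and ZGEMM. This is only my guess at the idea: the `gemm` name and signature are hypothetical, the loop is the naive triple loop rather than a kernel-based implementation, and the symmetric and triangular variants (SYMM, TRMM) would still need their structure handled separately.

```d
import std.complex : Complex;

// Hypothetical generic GEMM: C = alpha * A * B + beta * C.
// All matrices are dense and row-major; A is m x k, B is k x n, C is m x n.
void gemm(T)(size_t m, size_t n, size_t k,
             T alpha, const(T)[] a, const(T)[] b,
             T beta, T[] c)
{
    assert(a.length == m * k && b.length == k * n && c.length == m * n);
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            T acc = T(0);
            foreach (p; 0 .. k)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
}

unittest
{
    // Real case, playing the role of DGEMM.
    double[] a = [1, 2, 3, 4];
    double[] b = [5, 6, 7, 8];
    double[] c = [0, 0, 0, 0];
    gemm!double(2, 2, 2, 1.0, a, b, 0.0, c);
    assert(c == [19.0, 22.0, 43.0, 50.0]);

    // Complex case, playing the role of ZGEMM: i * i == -1.
    alias Z = Complex!double;
    Z[] za = [Z(0, 1)];
    Z[] zb = [Z(0, 1)];
    Z[] zc = [Z(0)];
    gemm!Z(1, 1, 1, Z(1), za, zb, Z(0), zc);
    assert(zc[0] == Z(-1, 0));
}
```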
You mention that GEMM and GEMV are in your feature list. Do you
think we'll get a more complete slice of BLAS/LAPACK in the
future, such as GESVD and GEES?
If it's not in the plan, I'd be happy to work on re-tooling scid
and cblas to feel like std.blas (that is, mimic how you choose
to represent a matrix, throw the same types of exceptions, etc.,
while still using external libraries).
Thanks again for this!