GSoC 2012 Proposal: Continued Work on a D Linear Algebra library (SciD - std.linalg)

Tue Apr 24 07:02:14 PDT 2012

On 24 April 2012 08:32, Jens Mueller <jens.k.mueller at gmx.de> wrote:

> Cristi Cobzarenco wrote:
> > Unfortunately, my proposal was not picked up this year. I might try to
> > work on these ideas this summer anyway so I would still be very much
> > interested in ideas and feedback, but I will probably have much less
> > time if I'll be working somewhere else.
>
> I'm less familiar with your SciD code base but I have used Eigen
> regularly. Maybe you can answer my questions right away:
> 1. How are expressions evaluated? Eigen uses no BLAS backend. All
>   code is generated by the library itself.
>   Do you plan for such an option?
>

The expressions are built in a similar way to how they are in Eigen (and
many other linear algebra libraries), using expression templates that
essentially build abstract syntax trees, which get evaluated on assignment
(or with the eval() function). The back-end for evaluation can be selected
with a version flag. Specifying "version=nodeps" results in code being
generated by the library with no external dependencies. This is much
slower, as it does not use SIMD operations or similar optimisations.
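
To make the idea concrete, here is a minimal sketch of expression
templates, written in C++ since Eigen is the comparison point. It is not
SciD's or Eigen's actual code: the names Vec, Sum, eval and
example_lazy_sum are hypothetical and only element-wise addition is
modelled. operator+ builds a tree node without doing any arithmetic, and
the tree is only walked on assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tree node: the element-wise sum of two operands.
// It only stores references; no arithmetic happens when it is built.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Hypothetical vector type standing in for a matrix.
struct Vec {
    std::vector<double> elems;

    explicit Vec(std::size_t n, double fill = 0.0) : elems(n, fill) {}

    double operator[](std::size_t i) const { return elems[i]; }
    std::size_t size() const { return elems.size(); }

    // Assignment walks the tree once, element by element,
    // without allocating intermediate vectors.
    template <typename L, typename R>
    Vec& operator=(const Sum<L, R>& expr) {
        elems.resize(expr.size());
        for (std::size_t i = 0; i < expr.size(); ++i) elems[i] = expr[i];
        return *this;
    }
};

// operator+ only records its operands in a node.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) {
    return {a, b};
}

// Explicit evaluation into a fresh vector, analogous in spirit to eval().
template <typename L, typename R>
Vec eval(const Sum<L, R>& expr) {
    Vec out(expr.size());
    out = expr;
    return out;
}

// Usage: the right-hand side is a Sum<Sum<Vec, Vec>, Vec> until assigned.
inline double example_lazy_sum() {
    Vec a(3, 1.0), b(3, 2.0), c(3, 3.0), r(3);
    r = a + b + c;
    return r[0];
}
```

The nodes hold references, so in this sketch an expression must be
assigned or evaluated within the statement that builds it.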

In the current state of the library, this is done by essentially providing
naive implementations of the BLAS functions. In the revamped version, I
plan to do this in a slightly different way (more similar to the way
Eigen works), which removes the need for temporaries in some cases where
using BLAS/LAPACK makes temporary allocation inevitable. In the immediate
future I do not plan to include SIMD operations for the version=nodeps
build, so the library will be most efficient when using BLAS & LAPACK.

> 2. What is your goal for SciD? Do you want to have an Eigen in D? Are
>   there places where a D solution may improve on Eigen?
>

These are really two questions:
> 2.1. What is your goal for SciD? Do you want to have an Eigen in D?

Short answer: no. We already have a working library; it's mostly the
interface that's getting a face-lift. Some of the design decisions we made
early on (such as support for views) and compiler bugs (such as a clash
between templated opAssign and the postblit constructor) forced us to make
some unfortunate interface decisions. In the long run it became obvious
that views are pretty much useless, and a lot of the compiler bugs that
were hindering progress have been fixed, so I thought I would improve the
library's interface.

The more I worked on it, the more obvious it became I was converging onto
Eigen's interface, with bits from MATLAB.

> 2.2. Are there places where a D solution may improve on Eigen?

I'm not entirely sure. There's the obvious stuff, like the more natural
slicing syntax which D allows. Easily swappable backends are another thing:
I'm writing everything to allow custom backends for the matrices (the
backend will be a template parameter in the new version). This would, in
theory, allow GPU matrices to be written easily if anyone wanted
(particularly if someone wrapped something like CUBLAS).
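
A rough sketch of what "backend as a template parameter" could look like,
again in C++ for illustration only. NaiveCpuBackend, Matrix and
example_product are hypothetical names, not the planned SciD interface.
The matrix forwards allocation and multiplication to whatever backend type
it is instantiated with, so a GPU backend would supply its own Storage and
gemm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU backend: owns host memory and supplies the kernels.
struct NaiveCpuBackend {
    using Storage = std::vector<double>;

    static Storage allocate(std::size_t n) { return Storage(n, 0.0); }

    // C = A * B for row-major (m x k) and (k x n) inputs, plain triple loop.
    static void gemm(std::size_t m, std::size_t n, std::size_t k,
                     const Storage& a, const Storage& b, Storage& c) {
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                double acc = 0.0;
                for (std::size_t p = 0; p < k; ++p)
                    acc += a[i * k + p] * b[p * n + j];
                c[i * n + j] = acc;
            }
        }
    }
};

// The matrix never touches memory or kernels directly, only via Backend.
template <typename Backend = NaiveCpuBackend>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), store_(Backend::allocate(rows * cols)) {}

    // Element access assumes host-addressable storage (an assumption that
    // a real GPU backend would have to replace with explicit transfers).
    double& operator()(std::size_t i, std::size_t j) {
        return store_[i * cols_ + j];
    }
    double operator()(std::size_t i, std::size_t j) const {
        return store_[i * cols_ + j];
    }

    friend Matrix operator*(const Matrix& a, const Matrix& b) {
        assert(a.cols_ == b.rows_);
        Matrix c(a.rows_, b.cols_);
        Backend::gemm(a.rows_, b.cols_, a.cols_, a.store_, b.store_, c.store_);
        return c;
    }

private:
    std::size_t rows_, cols_;
    typename Backend::Storage store_;
};

// Usage: multiply two 2x2 matrices with the default backend.
inline double example_product() {
    Matrix<> a(2, 2), b(2, 2);
    a(0, 0) = 1; a(0, 1) = 2; a(1, 0) = 3; a(1, 1) = 4;
    b(0, 0) = 5; b(0, 1) = 6; b(1, 0) = 7; b(1, 1) = 8;
    Matrix<> c = a * b;
    return c(1, 0);  // row 1 of a times column 0 of b
}
```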

I am actually very interested in hearing from other people if they think
there's stuff which we could do better thanks to D.

> 3. I'm not so sure about the array() stuff. I never liked it in Eigen.
>   Being able to call std.algorithm on the raw memory may be sufficient
>   for the time being.
>

I am starting to lean against array myself. I mostly wanted to allow for
arbitrarily typed, higher-dimensional matrices with element-wise operations.
But having worked a bit on the changes already, I realised that it's very
easy to disallow certain matrix operations on types which do not support
multiplication, for example. Also, I think I found an elegant way of
supporting higher-dimensional matrices using the same type, but I have to
do a bit more research into that.
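
As an illustration of disallowing matrix operations for element types
without multiplication, here is a sketch in C++ (in D the same effect would
presumably come from a template constraint, which is my assumption rather
than a description of the actual code). Grid, has_multiply and
example_constrained are hypothetical names. The product operator simply
does not exist for element types where `T * T` is not a valid expression,
while the container itself still works for them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Detects at compile time whether `T * T` is a valid expression.
template <typename T, typename = void>
struct has_multiply : std::false_type {};

template <typename T>
struct has_multiply<T, decltype(void(std::declval<T>() * std::declval<T>()))>
    : std::true_type {};

// Hypothetical container usable with any element type.
template <typename T>
struct Grid {
    std::size_t rows, cols;
    std::vector<T> elems;

    Grid(std::size_t r, std::size_t c, T fill = T())
        : rows(r), cols(c), elems(r * c, fill) {}

    T& at(std::size_t i, std::size_t j) { return elems[i * cols + j]; }
    const T& at(std::size_t i, std::size_t j) const {
        return elems[i * cols + j];
    }
};

// The matrix product only takes part in overload resolution when T
// supports `*`; this sketch also assumes T supports `+` and T().
template <typename T,
          typename std::enable_if<has_multiply<T>::value, int>::type = 0>
Grid<T> operator*(const Grid<T>& a, const Grid<T>& b) {
    assert(a.cols == b.rows);
    Grid<T> c(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t j = 0; j < b.cols; ++j) {
            T acc = T();
            for (std::size_t p = 0; p < a.cols; ++p)
                acc = acc + a.at(i, p) * b.at(p, j);
            c.at(i, j) = acc;
        }
    }
    return c;
}

static_assert(has_multiply<double>::value, "double supports *");
static_assert(!has_multiply<std::string>::value, "std::string has no *");

// Usage: Grid<double> can be multiplied. A Grid<std::string> can be built
// and indexed, but writing `s * s` for it would fail to compile.
inline double example_constrained() {
    Grid<double> a(1, 2, 2.0), b(2, 1, 3.0);
    Grid<double> c = a * b;
    return c.at(0, 0);  // 2*3 + 2*3
}
```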

In terms of running std.algorithm on raw memory, that is and will be
possible. Matrices provide two properties, .data and .cdata, each of which
gives a pointer to the raw memory (.cdata gives a constant one, which may
point to memory shared by multiple matrices, while .data makes sure the
memory is unique).
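
The distinction between the two accessors can be sketched as follows, in
C++ and under the assumption that "making sure the memory is unique" means
copying a shared buffer before handing out a writable pointer. CowBuffer
and example_cow are hypothetical names; only the accessor names cdata and
data mirror the properties above. std::sort stands in for running a
generic algorithm on the raw memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical buffer with shared storage; copying it shares the memory.
class CowBuffer {
public:
    explicit CowBuffer(std::size_t n)
        : buf_(std::make_shared<std::vector<double>>(n, 0.0)) {}

    std::size_t size() const { return buf_->size(); }

    // Read-only pointer; the memory may be shared with other copies.
    const double* cdata() const { return buf_->data(); }

    // Writable pointer; if the memory is shared, take a private copy first.
    // Single-threaded sketch: use_count() is not a safe test across threads.
    double* data() {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::vector<double>>(*buf_);
        return buf_->data();
    }

private:
    std::shared_ptr<std::vector<double>> buf_;
};

// Usage: b shares a's memory until b asks for a writable pointer.
inline bool example_cow() {
    CowBuffer a(3);
    double* p = a.data();  // a is the only owner, so nothing is copied
    p[0] = 3.0; p[1] = 1.0; p[2] = 2.0;

    CowBuffer b = a;  // shares the buffer with a
    bool shared_before = (a.cdata() == b.cdata());

    // Sorting through data() detaches b; a keeps its original order.
    std::sort(b.data(), b.data() + b.size());

    return shared_before && a.cdata() != b.cdata() &&
           a.cdata()[0] == 3.0 && b.cdata()[0] == 1.0;
}
```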

> 4. How about a sparse storage backend for sparse matrices? I'm missing
>   sparse matrices in Eigen even though the situation is improving but
>   they're not fully integrated yet.
>

Definitely. Sparse storage is a long-term goal for the library, but not in
the first iteration.

>
> I'd like to support your efforts for including your work in Phobos. How
> about you clone Phobos and gradually move your work into a work in
> progress branch? Really little steps. This allows me to follow you
> closely and to familiarize myself with the code. I would then fill in
> unittests, documentation and benchmarking code where missing as we go.
>

That is actually a better idea than what I had in mind. I will do that. I
am really glad you're interested in the project, but unfortunately I won't
be able to start work on the library until the beginning of June. Also, the
proposal was written with the project as my full-time job this summer, but
since I'll probably be working somewhere else now, you should take into
account that I probably won't be able to stick to the schedule I gave in
the proposal.

>
> Jens
>
> PS
> It's great to hear that you plan for including your work in Phobos.
>

Not trying to include it into Phobos was a mistake that we made last year.
It should have been this way from the beginning. Thanks for showing
interest; I'll keep you posted once I start working. I hope I answered
your questions satisfactorily. If not, let me know.

---
Cristi Cobzarenco
BSc in Artificial Intelligence and Computer Science
University of Edinburgh
Profile: http://www.google.com/profiles/cristi.cobzarenco