Designing a matrix library for D

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Mon Jun 23 14:32:03 PDT 2014


On Mon, Jun 23, 2014 at 09:08:02PM +0000, Mason McGill via Digitalmars-d wrote:
> On Monday, 23 June 2014 at 16:45:28 UTC, bearophile wrote:
> I'm glad to hear there's interest in this! I've actually been working
> on a generic multidimensional array library myself (on and off) for
> the past few months (I'm hoping to have a well-documented dub package
> by the end of the summer). It's a bit of a hybrid between Julia's
> `AbstractArray`, NumPy's `ndarray`, Numba, and std.range, and I think
> it has a very "D" feel.

You are far from the first one to write a multidimensional array library
in D. Denis has already done this some time ago, and I have, too. It's
clear that we need to get our heads together and knock up something
Phobos-worthy.


> It is, however, designed for processing sampled signals (rather than
> matrix algebra), which may be more or less in line with different
> people's interests.

I've thought about this a lot, and my conclusion is that matrix algebra
is best left to a matrix-specific library wrapper that endows matrix
algebra properties on top of a generic multidimensional array type. The
main reason is that matrix algebra is rather peculiar to matrices
specifically, particularly the non-commutative matrix product, and does
not have very many useful usages outside of matrix algebra itself; yet
multidimensional arrays in general are useful in a wider scope.

So in my mind, there are several components that are needed here:
- A generic multidimensional array, as a data container. This can be a
  set of diverse concrete types, e.g. to represent sparse matrices,
  etc.. No domain-specific operations are specified for these
  containers; they are purely for storage and access only.

- Specialized array types, like matrices which add matrix algebra
  semantics to the underlying multidimensional array storage container
  (e.g., matrix product), or tensor wrappers (endows tensor product
  semantics on the underlying container).

- Algorithms: these take any multidimensional array, or specialized
  matrix type (depending on the purpose of each algorithm), and operate
  on them using a common multidimensional array API. They should be as
  generic as they can be, but no more, e.g., LU decomposition algorithms
  would take a matrix type rather than the general multidimensional
  array type, because it is specific to matrices, whereas per-element
  summation would take any general multidimensional array type, since it
  doesn't need matrix-specific properties. Similarly, subarray
  operations should be able to work with any multidimensional array
  type, since it simply provides a restricted view on the underlying
  storage container, but doesn't depend on the implementational
  specifics of the container itself.


[...]
> I did a lot of thinking about the level of abstraction that would be
> the most useful, and I settled on something like Julia's
> `AbstractArray` because it fits into D nicely ("like ranges, but
> different!") and easily maps onto something that can be passed to
> other languages (a block of memory with size/strides metadata). As in
> Julia, I decided to make rank (`nDim!grid`) a compile-time entity
> since
> 
> 1) Writing rank-branching code for NumPy is cumbersome and error prone; rank
> should be part of overload resolution.
> 2) It speeds up component-programming-style operations without requiring
> leaky abstractions (having to specialize on concrete grid types).
> 
> It's a busy week so I probably won't be able to follow this thread, but I
> wanted to let you know I was working on this to avoid duplicated efforts,
> etc.
[...]

I think the best solution would be a generic multidimensional concept
(analogous to the input range concept) that will fit all of our
multidimensional array implementations. Algorithms that can work with
diverse ranks shouldn't care if the rank of a particular concrete type
is compile-time or runtime, for example. Let the algorithms be
adaptable (up to practicality, of course -- if an algo can't work or
works very poorly with dynamic rank, then just add a sig constraint that
requires static rank, for example), and the user can choose which
concrete type(s) best suits his needs.

The concrete types can then be provided by a separate module, and if we
do it right, should be interoperable with each other.


T

-- 
"How are you doing?" "Doing what?"


More information about the Digitalmars-d mailing list