Matrix-type-friendly syntax and more

bearophile bearophileHUGS at lycos.com
Sun Oct 9 11:29:28 PDT 2011


Kenji Hara has just put a large patch by Don into a Git branch:
https://github.com/9rnsr/dmd/branches/opDollar

It allows user-defined collections to be used with this syntax:

x[$-2, y[$-6, $-9], $-2]

See:
http://d.puremagic.com/issues/show_bug.cgi?id=3474
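
As a rough sketch of what this means for a user-defined type: the code below assumes a compiler that supports the per-dimension opDollar template described in the current D language specification, and the branch above may differ in detail. The Matrix struct and its field names are hypothetical, used only for illustration.

```d
struct Matrix
{
    double[] data;      // elements in row-major order
    size_t rows, cols;

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        data = new double[rows * cols];
        data[] = 0.0;   // doubles default to NaN in D, so zero them
    }

    // Value of $ when it appears in position `dim` of the index list.
    size_t opDollar(size_t dim)() const
    {
        static if (dim == 0)
            return rows;
        else static if (dim == 1)
            return cols;
        else
            static assert(0, "Matrix has only two dimensions");
    }

    // Returning by ref lets m[r, c] be both read and assigned.
    ref double opIndex(size_t r, size_t c)
    {
        assert(r < rows && c < cols, "index out of bounds");
        return data[r * cols + c];
    }
}

void main()
{
    auto m = Matrix(3, 4);
    m[$ - 1, $ - 2] = 7.0;    // $ is 3 in the first slot, 4 in the second
    assert(m[2, 2] == 7.0);
}
```

Each $ is resolved against the dimension it appears in, so in a nested expression like the one above the inner $ values refer to y and the outer ones to x.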


But a syntax like this is also quite important for a user-defined matrix type that is used in numerical code:

m[i..j, k..w]

Similar slicing and dicing of arrays is used all the time in NumPy (array library for Python) code.

This is a comment by Don about it:

> (4) I have NOT implemented $ inside opSlice(), opSliceAssign().
> It could be done, but I believe those language features need work. They don't
> permit multi-dimensional slicing. I think they should be removed, and the
> functionality folded into opIndex.

Isn't it better to improve opSlice() instead of deprecating it?
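
For reference, here is a sketch of the design Don describes, with slicing folded into opIndex. It assumes a compiler that implements the opSlice!dim lowering given in the current D language specification, where m[i..j, k..w] becomes m.opIndex(m.opSlice!0(i, j), m.opSlice!1(k, w)). The Matrix and Range names are hypothetical.

```d
struct Matrix
{
    double[] data;      // elements in row-major order
    size_t rows, cols;

    // Half-open index range produced by `a .. b` (hypothetical helper type).
    static struct Range { size_t lo, hi; }

    size_t opDollar(size_t dim)() const
    {
        static assert(dim < 2, "Matrix has only two dimensions");
        return dim == 0 ? rows : cols;
    }

    // Called once per `a .. b` in the index list, with its position as `dim`.
    Range opSlice(size_t dim)(size_t lo, size_t hi) const
    {
        assert(lo <= hi && hi <= opDollar!dim(), "slice out of bounds");
        return Range(lo, hi);
    }

    ref double opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }

    // m[i .. j, k .. w]: copy the selected block into a new Matrix.
    Matrix opIndex(Range r, Range c)
    {
        immutable h = r.hi - r.lo, w = c.hi - c.lo;
        auto result = Matrix(new double[h * w], h, w);
        foreach (i; r.lo .. r.hi)
            foreach (j; c.lo .. c.hi)
                result[i - r.lo, j - c.lo] = this[i, j];
        return result;
    }
}

void main()
{
    auto m = Matrix(new double[12], 3, 4);
    foreach (i, ref x; m.data)
        x = i;                      // 0, 1, ..., 11 in row-major order
    auto s = m[1 .. 3, 1 .. $];     // rows 1-2, columns 1-3
    assert(s.rows == 2 && s.cols == 3);
    assert(s[0, 0] == 5 && s[1, 2] == 11);
}
```

This version copies the block for simplicity. A real array type would more likely return a view that shares storage with the original, as NumPy does.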


I don't remember whether people appreciated the idea of a stride (like the third, step argument of std.range.iota). This too is used in scientific array-oriented code:

m[i..j:2, k..w:3]
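
The stride syntax above is only a proposal. As a sketch of what a one-dimensional row[1..7:2] would select, here is the same selection expressed with the existing std.range.stride and std.range.iota functions (the array contents are made up for illustration):

```d
import std.algorithm : equal;
import std.range : iota, stride;

void main()
{
    auto row = [10, 11, 12, 13, 14, 15, 16, 17];

    // Elements a hypothetical row[1 .. 7 : 2] would pick:
    // take the slice 1 .. 7, then every second element of it.
    assert(equal(stride(row[1 .. 7], 2), [11, 13, 15]));

    // The matching index sequence, using iota's step argument.
    assert(equal(iota(1, 7, 2), [1, 3, 5]));
}
```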

---------------

In Python code there is often a need for a good multi-dimensional array type, even in non-scientific code.

In the Python standard library there is an array module, but it's almost a joke: it's 1D and it's not used much:
http://docs.python.org/library/array.html

Python programmers use the multi-dimensional array of NumPy. It's widely used around the world, and SciPy (a scientific programming library) uses it too.

The experience of Python, with its sorely felt lack of a good multi-dimensional array type, the failure of its array module, the success of NumPy, and the problems caused by two earlier, mutually incompatible array libraries, Numeric (http://people.csail.mit.edu/jrennie/python/numeric/ ) and numarray (http://www.stsci.edu/resources/software_hardware/numarray/numarray.html), tells me that it would be good to have a bare-bones but efficient multi-dimensional array type in Phobos, plus external libraries (not present in Phobos) that use those arrays to implement everything else they want.

I think that's a good tradeoff between the opposing needs of:
- Keeping Phobos at a reasonable size (to avoid increasing the burden of its management too much and slowing down its development);
- Avoiding the risk of incompatible multi-dimensional array types. With one type in Phobos, most code out there can build on a common foundation. This avoids duplication (like the creation of Python's numarray and Numeric), allows efforts to be better focused, and speeds up the development of a language-wide standard for such arrays;
- Offering an nD array type to the casual D programmer, even one who doesn't want to or can't install other libraries. Even some 30-line D programs need multi-dimensional arrays, but they often don't need a complex scientific library too. An example of the problem: in the preconditions of my functions that take an array of arrays, I always have to test that the input is not jagged, that is, that it is a rectangular matrix. Such a test is not necessary for a multi-dimensional array, which is never jagged. Putting the bare-bones multi-dimensional array type in Phobos allows people to use it with zero other installs.
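
To make the jaggedness example concrete, here is a small sketch of the kind of precondition meant above. The function names isRectangular and sumOfColumn are hypothetical, and the in/do contract syntax assumes a reasonably recent compiler (older ones spell `do` as `body`).

```d
// Precondition helper needed for an array of arrays: every row must
// have the same length as the first one.
bool isRectangular(const double[][] m)
{
    foreach (row; m)
        if (row.length != m[0].length)
            return false;
    return true;
}

// Hypothetical function that is only meaningful for rectangular input.
double sumOfColumn(const double[][] m, size_t col)
in
{
    assert(isRectangular(m), "input must not be jagged");
    assert(m.length == 0 || col < m[0].length, "column out of range");
}
do
{
    double total = 0.0;
    foreach (row; m)
        total += row[col];
    return total;
}

void main()
{
    double[][] ok = [[1.0, 2.0], [3.0, 4.0]];
    double[][] jagged = [[1.0, 2.0], [3.0]];
    assert(isRectangular(ok));
    assert(!isRectangular(jagged));
    assert(sumOfColumn(ok, 1) == 6.0);
}
```

With a true multi-dimensional array type the shape is stored once for the whole array, so the first assertion in the contract has nothing left to check.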

This multi-dimensional Phobos array type doesn't even need to contain code to invert a matrix, compute a determinant, etc. It just needs basic operations like allocation, indexing, multi-dimensional slicing, change of shape, iteration... Everything else goes in modules/packages external to Phobos.
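
A very rough sketch of how small such a type could be, covering allocation, indexing, change of shape and iteration. NdArray and all its member names are hypothetical, not an existing Phobos API, and the code relies on D's typesafe variadic parameters for static arrays.

```d
struct NdArray(T, size_t N)
{
    T[] data;           // elements in row-major order
    size_t[N] shape;    // extent of each dimension

    // Allocation: NdArray!(int, 2)(3, 4) makes a 3x4 array.
    this(size_t[N] shape...)
    {
        this.shape = shape;
        size_t total = 1;
        foreach (extent; shape)
            total *= extent;
        data = new T[total];
    }

    // Indexing: a[i, j, ...] with exactly one index per dimension.
    ref T opIndex(size_t[N] index...)
    {
        size_t offset = 0;
        foreach (d; 0 .. N)
        {
            assert(index[d] < shape[d], "index out of bounds");
            offset = offset * shape[d] + index[d];
        }
        return data[offset];
    }

    // Change of shape: a view over the same elements with new extents.
    NdArray!(T, M) reshape(size_t M)(size_t[M] newShape...)
    {
        size_t total = 1;
        foreach (extent; newShape)
            total *= extent;
        assert(total == data.length, "reshape must keep the element count");

        NdArray!(T, M) view;
        view.shape = newShape;
        view.data = data;   // shares storage with the original
        return view;
    }
}

void main()
{
    auto a = NdArray!(int, 2)(3, 4);
    foreach (i, ref x; a.data)      // iteration over all elements
        x = cast(int) i;
    assert(a[1, 2] == 6);           // offset 1 * 4 + 2

    auto b = a.reshape!3(2, 3, 2);
    assert(b[1, 0, 1] == 7);        // offset (1 * 3 + 0) * 2 + 1
    b[0, 0, 0] = 42;
    assert(a[0, 0] == 42);          // the view shares data with a
}
```

Because the shape is a fixed-size array stored next to the data, such an array can never be jagged, and everything numerical (inversion, determinants and so on) can live in external packages that accept this one type.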

I am not suggesting putting a sparse multi-dimensional array type in Phobos. That need is much less common in casual short programs, so it is better left to external modules.

Bye,
bearophile

