Built-in array ops

Wed Jul 25 18:03:38 PDT 2012

This is useful Mathlab code (from Expokit, it's part of code to 
compute exponents of sparse matrices coming from Markov chains):
http://www.maths.uq.edu.au/expokit/matlab/mexpv.m

It's related to this Fortran code:
http://www.maths.uq.edu.au/expokit/fortran/dmexpv.f

If I have to write minimally serious D code to implement 
something similar to that Mathlab or Fortran code, like this:

while t_now < t_out
   istep = istep + 1;
   t_step = min( t_out-t_now,t_new );
   V = zeros(n,m+1);
   H = zeros(m+2,m+2);

   V(:,1) = (1/beta)*w;
   for j = 1:m
      p = A*V(:,j);
      for i = 1:j
         H(i,j) = V(:,i)'*p;
         p = p-H(i,j)*V(:,i);
      end;
      s = norm(p);
      if s < btol,
         k1 = 0;
         mb = j;
         t_step = t_out-t_now;
         break;
      end;
      H(j+1,j) = s;
      V(:,j+1) = (1/s)*p;
   end;
   if k1 ~= 0,
      H(m+2,m+1) = 1;
      avnorm = norm(A*V(:,m+1));
   end;

I use a library that implements efficient rectangular and sparse 
matrices and some useful functions like zeros() and more, so the 
Mathlab code like this:

p = p-H(i,j)*V(:,i);

becomes (assuming Don's proposal to introduce multiple 
user-defined dollars):

p -= H(i, j) * V(0..$, i);

replacing ":" with "0..$" introduces more noise but it's probably 
acceptable. In such code I use only such library-defined data 
structures, so built-in array ops are never used.

Maybe you want to use built-in array ops in casual D code, that 
doesn't want to import a whole good matrix library, but in 
practice so far in my D2 code I have needed vector ops only quite 
rarely (while I have several time desired a sparse matrix 
library, for simulations and more).

In well known C++ libraries templates are used in complex ways to 
remove the creation of intermediate arrays in long vectorized 
expressions. D built-in vector ops are able to do that, but when 
performance is important I am probably already using a 
matrix/vector library (also for flexibility, because it supports 
sparse arrays, disk-mapped very large tensors, and so on). I 
don't remember people using those template metaprogramming tricks 
in D.

While I like built-in array ops, what are their use cases to 
justify their presence in the language? Has someone used them a 
lot so far? Is deprecating them going to damage future D in some 
way?

Isn't it better to introduce in D syntax (like the multiple $ 
from Don) that allows future D programmers to implement highly 
efficient matrix libraries with a clean syntax for the final 
user? "Highly efficient" may mean offer built-in ways to avoid 
the creation of intermediate library-defined arrays in complex 
expressions, and maybe it also means offering built-in ways to 
help the management of the hierarchical nature of the memory of 
future computing systems, as done in Chapel language.

This is one case where I agree with Andrei that giving language 
tools to build good libraries is probably better than putting 
things in the language itself. As example I think that built-in 
tuples are 200 times more useful/handy than built-in vector ops, 
despite I write mixed scientific D2 code that is using arrays a 
lot. Also, currently D vector ops are in many cases less 
efficient than normal loops, for all but the largest 1D dense 
arrays.

Bye,
bearophile