A gentle critique...

Dave Dave_member at pathlink.com
Wed May 17 07:13:12 PDT 2006


Paulo Herrera wrote:
> Walter Bright wrote:
>> Paulo Herrera wrote:
>>> Walter Bright wrote:
>>>> Don Clugston wrote:
>>>>
>>>>>> About native libraries
>>>>>> ----------------------
>>>>>> I think the way to go is to create QUALITY native D libraries. I 
>>>>>> think we usually overestimate the cost of porting/developing 
>>>>>> native libraries.
>>>>>
>>>>> I agree. When porting a library
>>>>> (a) there isn't any algorithm development (not much 'thinking' time 
>>>>> required);
>>>>> (b) it's very easy to test (you can run test cases against the 
>>>>> original version);
>>>>> (c) the D version is frequently greatly superior to the original.
>>>>
>>>> Hmm. Is there a 'canonical' Fortran numerics library with a solid 
>>>> test suite that could be translated to D as a showcase project?
>>>
>>> Do you have some idea in mind?
>>
>> I am not familiar with the various Fortran libraries out there, so I 
>> have no specific one in mind.
>>
>>
>>> If we can make a port of that library to D and show that it performs 
>>> close to  SPARSKIT, that would be a good demonstration of D 
>>> capabilities for numerical computing.
>>
>> Want to get started on it? <g>
> OK, I take the challenge. However, I'm pretty new to D so I'd like to 
> ask some questions before starting.
> 
> As someone else posted, most numerical libraries in Fortran and C/C++ 
> are based on BLAS and LAPACK. So, a first logical step is to evaluate 
> whether we should port those libraries to D. This weekend I took a look 
> at the specification of those libraries (http://www.netlib.org/blas/). 
> After going through the document I'm not sure what the best way is to 
> write a similar library in a "D way". I will try to explain what I mean 
> by "D way" ....
> I'm sure many people are familiar with those libraries, but I include a 
> brief explanation below to make my question clear.
> 
> Those libraries define functions to compute basic matrix/vector 
> operations, such as:
> - x = y, where x and y are vectors
> - r = aAx, where A is a matrix, a is a scalar, and r and x are vectors
> - r = aAx + by; where a and b are scalars, A is a matrix, and r, x and y 
>  are vectors.
> 
> Since they were originally developed as a specification for old 
> Fortran77 and C, the declaration of the routines that implement those 
> operations look like:
> - x = y    => void copy(T *x, T *y)
> - r = aAx  => void mult0(T a, T *A, T *x, T *r), etc.
> 
> Those declarations are not elegant and not easy to use. My first 
> question is: should we write a D library that way to get the best 
> performance? I hope the answer is no, because avoiding that style would 
> be an advantage of D over other languages.
> I believe we should overload operators to make the notation as natural 
> as possible. This is something the designers of BLAS also faced with 
> Fortran95, which includes array operations: "Some of the functions 
> ... can be replaced by simple array expressions and assignments in 
> Fortran95, without loss of convenience or performance (assuming a 
> reasonable degree of optimization by the compiler)...." (in "Basic 
> Linear Algebra Subprograms Technical (BLAST) Forum Standard", 2001, p. 26)
> 
> My questions are:
> 1) Do you think it's possible to write vector and matrix classes with 
> overloaded operators that perform as well as the primitive BLAS 

Not right now probably, but...

> operations? What about temporary objects?

...check out this thread regarding temporaries:

http://www.digitalmars.com/d/archives/digitalmars/D/35949.html

This hasn't been implemented (yet), but given Walter's comment it looks 
promising. I'm assuming you agree this would be a great idea too.
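To make the temporaries issue concrete, here's a minimal sketch of a vector type with overloaded operators (written in current D syntax; the `Vec` type and its layout are made up for illustration). Every binary operation allocates a fresh result array, which is exactly the overhead that thread is about eliminating:

```d
import std.stdio;

// Hypothetical vector type with overloaded operators.
// Each binary operation below allocates a temporary result array.
struct Vec
{
    double[] data;

    // Element-wise + and - via one templated operator.
    Vec opBinary(string op)(Vec rhs) const
        if (op == "+" || op == "-")
    {
        auto r = new double[data.length];
        foreach (i, x; data)
            mixin("r[i] = x " ~ op ~ " rhs.data[i];");
        return Vec(r);
    }

    // Scaling by a scalar.
    Vec opBinary(string op : "*")(double s) const
    {
        auto r = new double[data.length];
        foreach (i, x; data)
            r[i] = x * s;
        return Vec(r);
    }
}

void main()
{
    auto x = Vec([1.0, 2.0, 3.0]);
    auto y = Vec([4.0, 5.0, 6.0]);
    auto z = x * 2.0 + y;   // two temporaries: (x * 2.0) and the sum
    writeln(z.data);        // [6, 9, 12]
}
```

A hand-written BLAS-style loop would compute `z` with no allocations at all, so folding expressions like `x * 2.0 + y` into a single loop (as discussed in that thread) is where the payoff would be.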

> 2) Do you think some of those operations can be accelerated by 
> implementing them as part of the language (as described in D's future 
> work)? If yes, is there any time frame for those changes?
> 3) Since the type T in those expressions can change, what is the best 
> way to implement those functions without losing performance? Templates? 

Yes.
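Templates specialize at compile time, so each instantiation is compiled as if it had been written by hand for that type; there's no runtime dispatch to pay for. A sketch using the classic BLAS level-1 axpy kernel (y = a*x + y) as the example (the function name follows BLAS convention but this is not an actual library):

```d
// Hypothetical templated axpy: y = a*x + y.
// One source-level function; a separate, fully specialized instance is
// generated for each element type T, just like hand-written saxpy/daxpy.
void axpy(T)(T a, const T[] x, T[] y)
{
    foreach (i, xi; x)
        y[i] += a * xi;
}

void main()
{
    double[] x = [1.0, 2.0, 3.0];
    double[] y = [10.0, 20.0, 30.0];
    axpy(2.0, x, y);            // instantiates axpy!double
    assert(y == [12.0, 24.0, 36.0]);

    float[] xf = [1.0f, 2.0f];
    float[] yf = [0.0f, 0.0f];
    axpy(3.0f, xf, yf);         // a separate instance, axpy!float
    assert(yf == [3.0f, 6.0f]);
}
```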

>   Since in general people will use only one data type in their programs, 
> could we use typedef or alias to get better performance?

Typedef or alias shouldn't make a difference to performance, but may be a 
convenience.
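That is, an alias is purely a compile-time rename, so the generated code is identical; the win is being able to switch precision in one place. A sketch (the `Real` name and `dot` function are made up):

```d
// An alias is a compile-time rename: code using `Real` compiles to
// exactly the same machine code as code using `double` directly.
alias Real = double;   // change to float (or real) to retarget everything

Real dot(const Real[] a, const Real[] b)
{
    Real s = 0;
    foreach (i, ai; a)
        s += ai * b[i];
    return s;
}

void main()
{
    assert(dot([1.0, 2.0], [3.0, 4.0]) == 11.0);
}
```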

> 4) Is there a real interest for this kind of libraries?

I'm not a numerical programmer myself, so I'll gladly defer to the 
opinions of others, but I think there would be interest because quite a 
few people from that field have been participating in these NGs. There 
really seems to be great interest in a curly-brace language that "does 
numerics right" <g>

> 
> Thanks,
> Paulo.

BTW - I downloaded this code to get an idea of 'raw' matrix performance:

http://rs.cipr.uib.no/mtj/bench/JAVAGEMM.html

and modified it to gather an average.

Here's what I get on a P4 2.2GHz, 512 KB L2, 512 MB RAM running FC5:

GDC v0.17, using dmd v0.140 w/ GCC v4.0.2 (GDC is the D port for GCC)
gdc -O3 -fomit-frame-pointer -funroll-loops -frelease dgemm.d -o dgemm
Max N = 500 avg. mfs: 632.507
Max N = 200 avg. mfs: 905.185

DMD v0.157
dmd -O -inline -release
Max N = 500 avg. mfs: 418.852
Max N = 200 avg. mfs: 485.995

Sun Java v1.5.0
java -server
Max N = 500 avg. mfs: 596.748
Max N = 200 avg. mfs: 797.121

DMD closes the gap where N > 300 and cache size is the bottleneck. In 
general, DMD does very well at integer stuff and I'm assuming floating 
point has yet to be fully optimized.

- Dave



More information about the Digitalmars-d mailing list