RFC: naming for FrontTransversal and Transversal ranges

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sat May 2 07:58:08 PDT 2009


Robert Jacques wrote:
> I do scientific computing. Generally, I find it breaks down into two 
> parts: things under 4x4, for which value types are probably better, and 
> everything else, for which value types are to be avoided like the 
> plague. I'll often work with hundreds of MB of data with algorithms that
> take minutes to hours to complete. So an unexpected copy is both hard to
> find (am I slow/crashing because of my algorithm, or because of a typo?)
> and rather harmful, because it's big.

I don't buy this. Undue copying is an issue that manifests itself
locally, reproducibly, and debuggably. Contrast that with long-distance
coupling, which is bound to be hard to debug: you change a matrix here,
and all of a sudden a matrix miles away has been messed up. Also,
efficiency can be fixed with COW (copy-on-write), whereas there is nothing
you can do to fix the coupling aside from relentless and patient user
education.
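
To make the COW remark concrete, here is a minimal sketch of a
copy-on-write value type in D. CowVector and its members are hypothetical
names made up for illustration, not anything from Phobos or from this
thread, and the reference count is deliberately naive (single-threaded,
GC-allocated). Copies share storage until one of them is written to.

```d
// Hypothetical copy-on-write vector: value semantics, lazy copying.
struct CowVector
{
    private double[] data;
    private size_t* refs;      // how many CowVector values share `data`

    this(size_t n)
    {
        data = new double[n];
        data[] = 0.0;
        refs = new size_t;
        *refs = 1;
    }

    this(this)                 // postblit: a copy only bumps the counter
    {
        if (refs) ++*refs;
    }

    ~this()
    {
        if (refs) --*refs;
    }

    double opIndex(size_t i) const { return data[i]; }

    void opIndexAssign(double v, size_t i)
    {
        if (refs && *refs > 1) // storage is shared: detach before writing
        {
            --*refs;
            data = data.dup;
            refs = new size_t;
            *refs = 1;
        }
        data[i] = v;
    }
}

void main()
{
    auto a = CowVector(3);
    auto b = a;                        // cheap: no element is copied yet
    assert(a.data.ptr is b.data.ptr);  // still the same storage

    b[0] = 5.0;                        // first write triggers the real copy
    assert(a[0] == 0.0);               // a is unaffected
    assert(b[0] == 5.0);
    assert(a.data.ptr !is b.data.ptr);
}
```

The caller sees plain value semantics. The cost of a copy is paid only by
the first write to a shared instance.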

Walter gave me a good argument (little did he know he was making a point
that destroys his own). Consider the progress we made when replacing
char[] with string. Why? Because with char[], long-distance dependencies
crop up easily and quickly. With string you know there's never going to be
a long-distance dependency. Why? Because, unlike char[], content
immutability makes string as good as a value.

I remember the nightmare. I'd define a little structure:

struct Sentence
{
     uint id;
     char[] data;
}

Above my desk I have a big red bulb along with an audible alarm. As soon
as I add the member "data", the bulb and the alarm go off. Sentence is
now an invalid struct - I need to add at least a constructor and a
postblit. In the constructor I need to call .dup on the incoming data,
and in the postblit I need to do something similar (or something more
complicated if I want to be efficient). This is a clear example of code
that is short and natural, yet does precisely the wrong thing. This is
simply a ton of trouble, as experience with C++ has shown.
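
A small sketch of the problem and of the repair described above. Sentence
is the struct from this message. SafeSentence is a hypothetical name for
the fixed version, and it uses the simple approach of calling .dup in both
places, not the "more complicated" efficient one.

```d
struct Sentence                // the naive version
{
    uint id;
    char[] data;
}

struct SafeSentence            // hypothetical name for the repaired version
{
    uint id;
    char[] data;

    this(uint id, const(char)[] text)
    {
        this.id = id;
        data = text.dup;       // own a private copy of the incoming text
    }

    this(this)                 // postblit: runs right after the bitwise copy
    {
        data = data.dup;       // give the new copy its own buffer
    }
}

void main()
{
    char[] buf = "hello".dup;

    auto s1 = Sentence(1, buf);
    auto s2 = s1;              // copies the slice, not the characters
    s2.data[0] = 'J';
    assert(s1.data == "Jello");   // s1 changed behind our back
    assert(buf == "Jello");       // and so did the caller's buffer

    char[] buf2 = "hello".dup;
    auto t1 = SafeSentence(1, buf2);
    auto t2 = t1;
    t2.data[0] = 'J';
    assert(t2.data == "Jello");
    assert(t1.data == "hello");   // untouched
    assert(buf2 == "hello");      // untouched
}
```

With string in place of char[], the naive struct is already safe, because
nobody can write through the shared slice.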

I'm not even getting into calling functions that take a char[] and 
keeping fingers crossed ("I hope they won't mess with it") or .dup-ing 
prior to the call to eliminate any doubt (even though the function may 
anyway call .dup internally). string has marked huge progress towards 
people considering D seriously.

> But I've generally worked on making 
> something else fast so more data can be crunched, etc. Actual prototype 
> work (for array/matrix based stuff at least) is often done in Matlab, 
> which I think uses COW under-the-hood to provide value semantics. So I 
> think anyone turning to D to do scientific computing will know reference 
> semantics, since they'd already be familiar with them from C/C++, etc 
> (Fortran?). Although successfully attracting algorithm prototypes from
> Matlab/Python/Mathematica/R/etc. is probably a bigger issue than just the
> container types, growing the pie was why the Wii won the last console wars.

Fortran uses pass by reference, but sort of gets away with it by
assuming and promoting no aliasing throughout. Any two named values in
Fortran can be assumed to refer to distinct memory. Also, unless I am
wrong, A = B in Fortran does the right thing (copies B's content into A).
Please confirm or refute.

For all I know, Matlab does the closest to "the real thing". Also, C++ 
numeric/scientific libraries invariably use value semantics in 
conjunction with expression templates meant to effect loop fusion. Why? 
Because value semantics is the right thing and C++ is able to express 
it. I should note, however, that Perl Data Language uses reference 
semantics (http://tinyurl.com/derlrh).

There's also a definite stench when one realizes that

a = b;

on one side, and

a = b * 1;

or

a = b + 0;

on the other, do completely different things.
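
To illustrate, here is a sketch with a hypothetical reference-semantics
matrix type (RefMatrix is a made-up class, written with current D2
operator overloading). Plain assignment aliases, while any arithmetic has
to allocate a fresh result, so mathematically identical right-hand sides
behave differently.

```d
class RefMatrix                // hypothetical reference-semantics type
{
    double[] elems;

    this(double[] elems) { this.elems = elems; }

    // Arithmetic with a scalar necessarily builds a new object.
    RefMatrix opBinary(string op)(double k)
        if (op == "*" || op == "+")
    {
        auto r = new double[elems.length];
        foreach (i, x; elems)
            r[i] = mixin("x " ~ op ~ " k");
        return new RefMatrix(r);
    }
}

void main()
{
    auto b = new RefMatrix([1.0, 2.0]);

    auto a1 = b;               // a = b      : a1 aliases b
    auto a2 = b * 1;           // a = b * 1  : independent object
    auto a3 = b + 0;           // a = b + 0  : independent object

    b.elems[0] = 42.0;         // change b afterwards

    assert(a1.elems[0] == 42.0);  // a1 silently followed b
    assert(a2.elems[0] == 1.0);   // a2 kept the old value
    assert(a3.elems[0] == 1.0);   // so did a3
}
```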

So what we're looking at is: languages that had the option chose value 
semantics. Languages that didn't, well, they did what they could.

I started rather neutral in this discussion but the more time goes by, 
the more things tilt towards value semantics.


Andrei


