D const design rationale

Dave Dave_member at pathlink.com
Sat Jun 23 22:48:14 PDT 2007


Bill Baxter wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>> Walter Bright wrote:
>>>> Sean Kelly wrote:
>>>>> Matter of opinion, I suppose.  The C++ design was immediately clear 
>>>>> to me, though it obviously wasn't for others.  I grant that the 
>>>>> aliasing problem can be confusing, but I feel that it is a 
>>>>> peripheral issue.
>>>>
>>>> I don't think it is a peripheral issue. It completely screws up 
>>>> optimization, is useless for threading support, and has spawned 
>>>> endless angst about why C++ code is slower than Fortran code.
>>>
>>> As a programmer, I consider the optimization problem to be a 
>>> non-issue.    Optimization is just magic that happens to make my 
>>> program run faster.
>>
>> Optimization often makes the difference between a successful project 
>> and a failure. C++ has failed to supplant FORTRAN, because although in 
>> every respect but one C++ is better, that one - optimization of arrays 
>> - matters a whole lot. It drives people using C++ to use inline 
>> assembler. They spend a lot of time on the issue. Various proposals to 
>> fix it, like 'noalias' and 'restrict', consume vast amounts of 
>> programmer time. And time is money.
> 
> FORTRAN is also helped by having a fairly standardized ABI that can be 
> called easily from lots of languages, which C++ lacks.  But C has that, 
> and it has also failed to supplant FORTRAN for numeric code.  But I 
> think Sean's right.  A lot of that is just that the language supports 
> things like actual multi-dimensional arrays (by 'actual' I mean 
> contiguous memory rather than pointers to pointers), and mathematical 
> operations on them right out of the box.  Telling a numerics person that 
> C/C++ will give them much better IO and GUI support, but take them a 

But what if you could get all of that in one language, plus the array performance of Fortran, with the 'safety' of things like invariant? To Walter's point, if a language can handle arrays well (i.e. with great performance), it makes a lot more sense to support things like multi-dimensional (MD) arrays properly in the language, and it becomes much easier to justify writing numerical libraries in D as well. Plus, a lot of applications that never even touch the FP stack could benefit from great array performance.
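
To make the MD array point concrete, here's a rough sketch (the struct and its names are just mine, for illustration) of what an 'actual' multi-dimensional array looks like - one contiguous block indexed by arithmetic instead of pointers to pointers:

struct Matrix
{
    double[] data;     // one contiguous allocation - what the optimizer wants
    size_t rows, cols;

    static Matrix create(size_t r, size_t c)
    {
        Matrix m;
        m.rows = r;
        m.cols = c;
        m.data = new double[r * c];
        return m;
    }

    // row-major: element (i,j) lives at data[i*cols + j]
    double opIndex(size_t i, size_t j) { return data[i * cols + j]; }
    void opIndexAssign(double v, size_t i, size_t j) { data[i * cols + j] = v; }
}

Iterating over the whole thing is one stride-1 sweep over a single allocation, which is exactly the access pattern Fortran compilers are built to exploit.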

D has a dilemma - because of its C lineage and semantics, it can't just ignore the aliasing issue (the way Fortran does, by simply forbidding aliased arguments), but in order to be an improvement over C and C++ in all the important respects, it should address the issue somehow. IMHO the most logical way (since pointer/reference and data-flow analysis probably can't eliminate even a majority of the aliasing issues) is to improve on C++'s idea of 'const' and hand the semantic control over to the programmer in a form the compiler can actually rely on.
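
Here's a quick sketch of what I mean (the function names are mine, and I'm assuming the proposed invariant qualifier works roughly as it's been described so far):

// With plain mutable slices, the compiler has to assume dst and src
// might overlap, so src[i] must be re-read around every store to dst -
// which blocks the kind of unrolling/vectorization Fortran compilers do.
void scaleMutable(double[] dst, double[] src, double k)
{
    for (size_t i = 0; i < dst.length; i++)
        dst[i] = k * src[i];
}

// If src is invariant, no reference anywhere in the program can change
// it, so the stores to dst provably can't touch src, and the loop can
// be optimized as aggressively as the Fortran equivalent.
void scaleInvariant(double[] dst, invariant(double)[] src, double k)
{
    for (size_t i = 0; i < dst.length; i++)
        dst[i] = k * src[i];
}

That's the guarantee 'restrict' tries to bolt onto C, except here it would be part of the type system rather than an unchecked promise from the programmer.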

> step back in terms of core numerics is like trying to sell a hunter a 
> fancy new gun with a fantastic scope that will let you pinpoint a mouse 
> at 500 yards but -- oh, I should mention it only shoots bb's.
>
> On the other hand, I suspect there's lots of code that's written in 
> FORTRAN supposedly for performance reasons that doesn't really need to 
> be.  Just as there's lots of code written in C++ that would perform fine 
> in a scripting language.  But people will still swear up and down that 
> {whatever} is the only language fast enough.  A lot of numerics folks do 
> realize this, however.  They just go from Fortran straight to Matlab, 
> and skip the other compiled languages altogether.
> 
> I guess what I'd like to say in summary is that I'm skeptical about the 
> claim that optimization "often" makes the difference between success and 
> failure.  "occasionally" I could believe.  Ill-advised premature 
> optimization has probably led to the demise of many more a project than 
> actual optimization problems in the end product.  We'll all gladly take 
> a free 20% speed improvement if the compiler can give it to us, but I 
> don't believe there are that many projects that will fail simply for 
> lack of that 20%.
> 

<soapbox>
In my experience this is not the case - I've worked on projects where "tuning" took up a significant portion (say ~20%) of the total cost even though the software nominally met the requirements, because end-users were not happy with the performance. Those projects probably would not have strictly "failed", because they did the job (slowly), but they would still have been "failures" in the eyes of the end-users to a significant degree. I'm sure we've all been there to one degree or another.

Some of those particular issues were taken care of by things like adding database indexes (which isn't directly related to what we're talking about here), but some of them involved exactly what Walter is talking about: how fast the language the application was written in could handle arrays of data. In one project, for example, we had to settle on single-precision FP, even though double precision met the requirements better, in order to mitigate performance complaints (the perf. issues weren't strictly due to the doubled data size; the compiler also simply handled singles better). In that case the users wanted the speed more than they wanted the sales commissions exact out to the penny every month, but of course the optimal solution would have been both, which a better compiler could have provided.

It takes a little extrapolation, but in the end, if the algorithms are correct, it too often boils down to a choice between the compiler emitting good code, a programmer having to write it by hand in assembler, or unhappy users.

In another project, the lead decided to run a general "usability" user-group survey before and after "tuning", and there was a big difference that could only be explained by the improved performance. The users didn't really recognize that there was a perf. problem until they saw that the application _could_ be faster, and once they saw the difference, it was accepted that much more quickly. Users care about this stuff, and I've always been dismayed when it's thought about only at the end of a project.

Also, one of the advantages of the new design is that most of it doesn't _have_ to be used... If you don't need to make an argument 'invariant' just for performance reasons, you don't have to. Part of the problem with const littering C++ code, I think, is that programmers often apply it just to be "const correct" even where it otherwise doesn't make a lot of sense.

The more a language and compiler can do to make performance optimal when it's needed - and spare us from having to hack together shortcuts and workarounds, or call into another language's API - the better, IMHO.
</soapbox>

> --bb


