D const design rationale

Sat Jun 23 08:32:36 PDT 2007

Walter Bright wrote:
> Sean Kelly wrote:
>> However, my point was that to a programmer, optimization should be 
>> invisible.
> 
> I can't agree with that. The compiler and the programmer need to 
> cooperate with each other in order to produce fast programs. If a 
> programmer just throws the completed program "over the wall" to the 
> optimizer and expects good results, he'll be sadly disappointed.

I suppose the issue is where the line should be drawn, and that line is 
constantly shifting.  Previously, "register" was actually a valuable 
tool in C/C++, now it's a pointless anachronism.  Techniques for 
producing optimal loops varied across compilers and platforms--in some 
cases using pointers was better, in some cases array indexing was 
better--and now those are largely pointless as well.  But I think there 
are a few different factors to be considered here.  A significant one is 
that compilers have simply gotten better.  Another is that hardware has 
improved tremendously.  But it's interesting that you mentioned FORTRAN 
as being the language to beat for numerics performance because it's 
positively ancient and, to my knowledge, does not require the programmer 
to make any extra effort to produce such optimal code.  It's just a 
side-effect of the language design.

I'm sure you recognize this because D has 'foreach', which is as much an 
optimization tool as it is a programming aid.  The thing is, 'foreach' 
would be great from a programmer perspective even if it were the bane of 
compiler optimization.  It clarifies loop syntax, reduces errors, and 
adds a great deal of flexibility for iteration.  These are the sort of 
features I want to see in D: those that make code more elegant and 
maintainable, and which produce optimal code simply as a side-effect of 
their design.  I suppose this is one reason I'm not quite ready to give 
up on "const by default" yet.  Converting code may be more difficult 
than with the current design, but the result seems to have the potential 
to be both cleaner and more suited to deep compiler optimization.

>> I can appreciate that 'invariant' may be of tremendous use to the 
>> compiler, but I balk at the notion of adding language features that 
>> seem largely intended as compiler "hints."
> 
> It's a lot more than that. First, there's the self-documenting aspect of 
> it. Second, it opens the way for functional programming, which can be of 
> huge importance.

Could you explain?  I've been trying to determine how I would write a 
library using these new features, and so far, 'invariant' simply seems 
more an obstacle to maintainability (because of the necessary code 
duplication), than it is an aid to programming.

>> Rather, it would be preferable if the language were structured in such 
>> a way as to make such hints unnecessary.
> 
> That would be preferable, but experience with such languages is that 
> they are complete failures at producing fast code. Fast code comes from 
> a language that enables the programmer to work with the optimizer to 
> produce better code.

Really?  I thought that many functional programming languages were quite 
fast.  And FORTRAN seems a suitable example for an imperative language 
which is quite fast as well.  To my knowledge, none of these contain 
features specifically intended for optimizing code.

> Lots of people think that just 'cuz they write in C++, which has good 
> optimizers available, they'll get fast code. That's not even close to 
> being true.

Of course not.  Garbage in, garbage out.

>> To that end, and speaking as someone who isn't primarily involved in 
>> numerics programming, my impression of FORTRAN is that the language is 
>> syntactically suited to numerics programming, while C++ is not.  Even 
>> if C++ performed on par with FORTRAN for similar work (and Bjarne 
>> suggested last year that it could), I would likely still choose 
>> FORTRAN over C++ because the syntax seems so much more appealing for 
>> that kind of work.
> 
> I programmed for years in FORTRAN. The syntax is not appealing, in fact, 
> it sucks. The reason it is suited for numerics programming is because 
> FORTRAN arrays, by definition, cannot be aliased. This means that 
> optimizers can go to town parallelizing array operations, which is a 
> big, big deal for speed.
> 
> You can't do that with C/C++ because arrays can be aliased. I have a 
> stack of papers 6" deep in the basement on trying to make C more 
> suitable for numerics, and it's mostly about, you guessed it, fixing the 
> alias problem. They failed.

That's the big secret?  Weird.  For some reason I assumed it was a bit 
less specific.
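
To make the alias problem concrete for myself, here is a minimal sketch 
in C.  The function and array names are hypothetical, made up for 
illustration.  The same loop gives different results depending on 
whether the two arrays overlap, so unless the compiler can prove they 
don't, it has to run the iterations strictly in order.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: each output element is
   computed from the previous input element. */
void add_prev(int *dst, const int *src, int n)
{
    for (int i = 1; i < n; i++)
        dst[i] = src[i - 1] + 1;
}

/* Runs the same loop on separate arrays and on one overlapping array. */
void demo_aliasing(void)
{
    int src[4] = {0, 0, 0, 0};
    int dst[4] = {0, 0, 0, 0};
    int both[4] = {0, 0, 0, 0};

    add_prev(dst, src, 4);    /* no overlap: iterations are independent */
    assert(dst[1] == 1 && dst[2] == 1 && dst[3] == 1);

    add_prev(both, both, 4);  /* overlap: each iteration feeds the next */
    assert(both[1] == 1 && both[2] == 2 && both[3] == 3);
}
```

If the language guarantees the first case, as I gather FORTRAN does, the 
iterations can be reordered or run in parallel without changing the 
result.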

> There are some template libraries for C++ which parallelize array 
> operations and can finally approach FORTRAN. But they are horrifically 
> kludgy and complicated. No thanks. But the effort that has gone into 
> them is astonishing, and is indicative of how severe the problem is.

Yup.  And I agree that this isn't the proper route to follow.  Heck, 
it's one reason I've lost interest in C++.

>>> I'm less sure about that. I think we're all so used to C++ and its 
>>> mushy concept of const that we don't know yet what will emerge from 
>>> the use of invariant. I do know, however, that those who want to do 
>>> advanced array optimizations are going to want to be using invariant 
>>> function parameters.
>>
>> You may be right, and I'm certainly willing to give it a try.  This is 
>> simply my initial reaction to the new design, and I wanted to voice it 
>> before becoming placated by experience.  My gut feeling is that a 
>> better design is possible, and I'm not yet ready to close the door on 
>> alternatives.
> 
> Andrei, I and Bartosz have each expended probably a hundred hours trying 
> to figure this out, and we've tried a lot of designs. If there is a 
> better design, it's not like we haven't tried. I wish to reiterate that 
> const designs in other languages like C++ (and to some extent Java) 
> utterly fail at the objectives we set for const in D. Furthermore, 
> Andrei is familiar with the many research papers on the topic, which 
> were of invaluable help. D's const system is more ambitious than any of 
> them.

I really do appreciate the effort which you all have made to find a 
solid design, and perhaps I simply don't have enough experience with it 
to feel comfortable with it yet.  As you've no doubt noticed, my issue 
is with 'invariant' from a conceptual and a code maintenance standpoint. 
On the one hand we have 'const' and on the other we have 'really 
really const'.  It just sticks in my craw that we need two separate 
keywords for what seem to be nearly identical properties, and that I may 
have to write separate code to suit each, etc, not to mention trying to 
explain all of it to a new programmer.
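
For what it's worth, this is how I picture the difference, sketched in C 
because I can't yet write it confidently in the new D syntax.  C has no 
'invariant', and the names below are hypothetical, so this only shows 
the 'const' half: a read-only view of data that someone else may still 
change.  My assumption is that 'invariant' is the promise that rules out 
the second call.

```c
#include <assert.h>

/* Hypothetical helper: 'view' is read-only through this pointer, but
   nothing stops 'other' from pointing to the same int. */
int read_twice(const int *view, int *other)
{
    int first = *view;
    *other += 1;            /* may change the object behind 'view' */
    int second = *view;     /* so this read cannot simply reuse 'first' */
    return first + second;
}

void demo_const_view(void)
{
    int x = 10, y = 0;
    assert(read_twice(&x, &y) == 20);  /* distinct objects: value is stable */
    assert(read_twice(&x, &x) == 21);  /* same object: changed under 'const' */
}
```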

>> The compiler can inspect the code however, and a global const is as 
>> good as an invariant for optimization (as far as I know).  As for the 
>> rest, I think the majority of remaining cases aren't ones where 
>> 'invariant' would apply anyway: dynamic buffers whose contents are 
>> guaranteed not to change either in word or by design, etc.
> 
> But they do apply - that's the whole array optimization thing. You're 
> not just going to have global arrays.

Perhaps I misunderstood the "must be known at compile-time" clause.  Can 
'invariant' apply to dynamic arrays that will remain unchanged once 
initialized?  In my work I use almost no static data--it's all generated 
on the fly or loaded from some data source.  Will 'invariant' help to 
make my code more optimal?

Sean