Getting the const-correctness of Object sorted once and for all

Sun May 13 22:34:44 PDT 2012

On 14-05-2012 06:18, Jonathan M Davis wrote:
> On Monday, May 14, 2012 05:47:54 Alex Rønne Petersen wrote:
>> It's kinda funny that something which on dlang.org's FAQ is described as
>> a compiler optimization hint (http://dlang.org/const-faq.html#const)
>> isn't useful at all for this purpose...
>
> Oh, it's definitely useful for optimizations. It just doesn't work with a
> couple of idioms that some programmers like to use in order to optimize their
> code.
>
> - Jonathan M Davis

I have yet to see any compiler make sensible use of the information 
provided by either C++'s const or D's const.

const in particular is completely useless to an optimizer because it 
does not give it any information that it can use for anything. The kind 
of information that an optimization pass, in general, wants to see is 
whether something is guaranteed to *never* change. const does not 
provide this information. const simply guarantees that the code working 
on the const data cannot alter it (but at the same time allows *other* 
code to alter it), which, as said, is useless to the optimizer.

immutable is a different story. immutable actually opens the door to 
many optimization opportunities exactly because the optimizer knows that 
the data will not be altered, ever. This allows it to (almost) 
arbitrarily reorder code, fold many computations at compile time, do 
conditional constant propagation, dead code elimination, ...

This seems reasonable. But now consider that the majority of functions 
*are written for const, not immutable*. Thereby, you're throwing away 
the immutable guarantee, which is what the *compiler* (not the 
*programmer*) cares about. immutable is an excellent idea in theory, but 
in practice, it doesn't help the compiler because you'd have to either

a) templatize all functions operating on const/immutable data so the 
compiler can retain the immutable guarantee when the input is such, or
b) explicitly duplicate code for the const and the immutable case.

Both approaches clearly suck. Templates don't play nice with 
polymorphism, and code duplication is...well...duplication. So, most of 
druntime and Phobos is written for const because const is the bridge 
between the mutable and immutable world, and writing code against that 
rather than explicitly against mutable/immutable data is just simpler. 
But this completely ruins any opportunity the compiler has to optimize!

(An interesting fact is that even the compiler engineers working on 
compilers for strictly pure functional languages have yet to take full 
advantage of the potential that a pure, immutable world offers. If 
*they* haven't done it yet, I don't think we're going to do it for a 
long time to come.)

Now, you might argue that the compiler could simply say "okay, this data 
is const, which means it cannot be changed in this particular piece of 
code and thus nowhere else, since it is not explicitly shared, and 
therefore not touched by any other threads". This would be great if 
shared weren't a complete design fallacy. Unfortunately, in most real 
world code, shared just doesn't cut it, and data is often shared between 
threads without using the shared qualifier (__gshared is one example).

shared is another can of worms entirely. I can list a few initial 
reasons why it's unrealistic and impractical:

1) It is extremely x86-biased; implementing it on other architectures is 
going to be...interesting (read: on many architectures, impossible at 
ISA level).
2) There is no bridge between shared and unshared like there is for 
mutable and immutable. This means that all code operating on shared data 
has to be templatized (no, casts will not suffice; the compiler can't 
insert memory barriers then) or code has to be explicitly duplicated for 
the shared and unshared case. Funnily, the exact same issue mentioned 
above for const and immutable!
3) It only provides documentation value. The low-level atomicity that it 
is supposed to provide (but doesn't yet...) is of extremely questionable 
value. In my experience, I never actually access shared data from 
multiple threads simultaneously, but rather, transfer the data from one 
thread to another and use it exclusively in the other thread (i.e. 
handing over the ownership). In such scenarios, shared just adds 
overhead (memory barriers are Bad (TM) for performance).
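
The ownership hand-off described in point 3 can be sketched as follows. This is a C++ analogy under my own assumptions, not D code and not a statement about how D's shared is implemented; sum_in_worker is a hypothetical name. The buffer is moved into the worker thread, only that thread touches it afterwards, and the only synchronization is thread start and join(), with no barrier per access:

```cpp
#include <cassert>
#include <memory>
#include <numeric>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical worker: takes sole ownership of the buffer and sums it on
// another thread.
long sum_in_worker(std::unique_ptr<std::vector<int>> data) {
    long result = 0;
    // Moving the pointer into the lambda hands ownership to the worker.
    // The calling thread no longer has any way to reach the buffer.
    std::thread worker([owned = std::move(data), &result] {
        result = std::accumulate(owned->begin(), owned->end(), 0L);
    });
    worker.join();   // result is safe to read once the worker has finished
    return result;
}
```

Calling sum_in_worker(std::move(buffer)) leaves buffer null in the caller, so the data is never accessed from two threads at once.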

/rant

-- 
- Alex