Writing const-correct code in D

Thu Mar 9 08:44:06 PST 2006

In article <dupfcr$erh$1 at digitaldaemon.com>, xs0 says...
>
>Kevin Bealer wrote:
>> Since people want the benefits of const, I'm showing a way to get them
>> by following coding conventions.  This requires *no* changes to D.
>> 
>> 
>> Also, this is not full "C++ const", only parameter passing and const
>> methods, which seem to be the most popular parts of the const idea.
>> It seems like it should require more syntax than C++, but it only
>> takes a small amount.
>> 
>> 
>> When working with types like "int", use "in" - const is not too much
>> of an issue here.
>> 
>> The same is true for struct, it gets copied in, which is fine for
>> small structs.  For larger structs, you might want to pass by "in *",
>> i.e. use "in Foo *".  You can modify this technique to use struct; for
>> that, see the last item in the numbered list at the end.
>> 
>> 
>> For classes, the issue is that the pointer will not be modified with
>> the "in" convention, but the values in the class may be.
>> 
>> : // "Problem" code
>> :
>> : class Bar {...}
>> :
>> : class Foo {
>> :   this(Bar b)
>> :   { x1 = b; }
>> :
>> :   this(const_Foo b)
>> :   {
>> :     x1 = b.x1.dup;
>> :   }
>> :
>> :   // Modifies this Foo
>> :   void changeBar(Bar b2)
>> :   { x1 = b2; }
>> :
>> :   // Does not modify this Foo
>> :   int doesWork() {...}
>> :
>> : protected:
>> :   Bar x1;
>> : };
>> :
>> : // NOTE: changes foo1
>> : void barfoo(in Foo foo1, in Bar b)
>> : {
>> :   foo1.changeBar(b);
>> : }
>> 
>> We'd like barfoo() not to modify foo1 - we want to guarantee it.
>> 
>> To deal with this, you can write a "const interface" for your class.
>> I recommend the prefix "const_" so that it looks a little like the C++
>> version.  This interface definition is quite simple to do.  Note that
>> a Foo is-a const_Foo, and passing it to a const_Foo interface is
>> legal.  But calling a modifying method through it will not compile.
>> 
>> NOTE: You don't need any extra method code, except constructors and
>> optionally the "clone()" method.  What we are doing is SPLITTING the
>> personality of Foo into two halves - read and write.
>> 
>> : // The read stuff
>> :
>> : class const_Foo {
>> :   this(Bar b)
>> :   { x1 = b; }
>> :   
>> :   this(Foo b)
>> :   { x1 = b.x1.dup; }
>> :   
>> :   // Does not modify Foo
>> :   int doesWork() {...}
>> :
>> :   Foo clone() // how to un-const (optional)
>> :   {
>> :      return new Foo(this); // use const->nonconst ctor
>> :   }
>> :
>> : protected:
>> :   Bar x1;
>> : };
>> :
>> : // The write stuff - can also do read stuff of course.
>> :
>> : class Foo : const_Foo {
>> :   this(Bar b)
>> :   {
>> :     super(b); // call the base-class (const_Foo) constructor
>> :   }
>> :   
>> :   this(const_Foo b) // const->nonconst ctor
>> :   {
>> :     super(b.x1.dup); // copy the Bar; const_Foo itself has no dup()
>> :   }
>> :   
>> :   void changeBar(Bar b2)
>> :   {
>> :     x1 = b2;
>> :   }
>> : };
>> :
>> : // Can only call this with non-const Foo.
>> : void barfoo(in Foo foo1, in Bar b)
>> : {
>> :   foo1.changeBar(b);
>> : }
>> : 
>> : // Can call this with either const_Foo or Foo.
>> : void barfaa(in const_Foo foo1)
>> : {
>> :   int q = foo1.doesWork();
>> : }
>> 
>> 1. In C++, you need to make the same division into const and
>> non-const, since every method must be labeled as "const" or not
>> labeled (and thus unusable in a const object).  So there is no extra
>> "design burden".
>> 
>> 2. You can easily change any method's constness by cut/pasting it to
>> the other class.  All implementation code/data is shared.
>> 
>> 3. Relationships are enforced!  If doesWork() calls changeBar(), the
>> compiler will complain.
>> 
>> 4. The class author decides whether "clone()" and the other special
>> methods are written at all - so if "Bar" is uncloneable for some
>> reason (e.g. maybe it's a File), don't write clone() for Foo, or find a
>> way to get around copying it.  This work needs to be done in C++ too.
>> 
>> 5. Users of const_Foo don't need to know what the editable Foo does.
>> Their code can't break unless the const_ side is changed.  It's now
>> very hard to miss the distinction between const/non-const, which is
>> easy to miss in C++ when writing methods for example.
>> 
>> 6. Easy to use as a Copy-On-Write design: If you need to store an
>> object, and don't know if it is const or not, use a const_Foo
>> reference.  In the event you need to modify it, you can test whether
>> it is const with a dynamic cast.  If it is, clone it first!
>> 
>> 7. In C++, you can also define distinct const and non-const methods
>> for a class.  This happens automatically here - the non-const method
>> (if one exists) just overrides the const one.
>> 
>> 8. Finally, for OOD/OOP purists: Although the non-const version is not
>> really "is-a" const, the relationship still holds once you realize
>> that const is really a "subtracting" adjective - we could use the
>> terms readable and read/writeable, where it is easy to see that a
>> read/writeable thing is-a readable thing.
>> 
>> 9. You can have "in Foo" parameters and "out const_Foo" without it
>> being a contradiction.  The first means "I don't want to change what
>> it points to -- something the caller might also want to know -- but I
>> might modify it."  The second is a way to return something.  [The
>> semantics of input and output (argument and return value) are normally
>> different in OO programming, since one is covariant and the other
>> contravariant. (This is true in D, right?)]
>> 
>> 10. For structs you can do a similar thing:
>> 
>> : // read-write version
>> : struct X {
>> :    int opIndex(int i) { ... }
>> :    int opIndexAssign(int value, int i) { ... }
>> :    
>> : private:
>> :    int[1024] data_;
>> : };
>> 
>> : // read-only version
>> : struct const_X {
>> :    int opIndex(int i) { return impl[i]; }
>> :    
>> :    X * clone()
>> :    {
>> :       X * copy = new X;  // structs have no .dup; allocate and copy
>> :       *copy = impl;
>> :       return copy;
>> :    }
>> :    
>> : private:
>> :    X impl;
>> : };
>> 
>> If people like this, maybe something along these lines would be useful
>> for the C++ programmer intro on the D site?  I can make a more
>> thorough version if so.  If people use this technique, it might be
>> good for them to follow the same style, i.e. method names.
>> 
>> Kevin
>
>OK, while it might work, your approach has several problems:
>
>- applicability
>    - there is no support for arrays and pointers to primitive types;
>      especially arrays are a problem

True.  This could be done with a wrapper too, particularly with IFTI, but
coding efficiency suffers a little.
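
For the array case, something along these lines is what I have in mind.  It is
only a sketch: ConstArray, constArray and sum are names I made up for
illustration, and the wrapper only protects the array slots themselves (if T is
a class reference, the object it points to is still writable).

```d
// Hypothetical read-only view of an array; all names are illustrative.
struct ConstArray(T)
{
    private T[] data_;

    // Reading is allowed: elements are handed out by value.
    T opIndex(size_t i) { return data_[i]; }

    size_t length() { return data_.length; }

    // The only way to get something writable is to copy.
    T[] dup() { return data_.dup; }

    // Deliberately no opIndexAssign.
}

// IFTI: T is deduced from the argument, so callers just write constArray(a).
ConstArray!(T) constArray(T)(T[] a)
{
    return ConstArray!(T)(a);
}

// A consumer that can read the elements but has no way to assign to them.
int sum(ConstArray!(int) a)
{
    int total = 0;
    for (size_t i = 0; i < a.length; i++)
        total += a[i];
    return total;
}
```

The extra typing is the wrapper itself plus a constArray(...) call at each place
where a plain array is handed to read-only code.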

>- coding efficiency
>    - maintenance of an additional class is bad, having to write
>      a wrapper for structs is even worse

C++ requires you to maintain two personalities in one class - you have to make
the same decisions, and can make most of the same errors.

>    - while arbitrarily flexible, the approach is also error-prone,
>      like all manual methods

If you don't write the special methods, there is very little extra work to do,
really just adding "class A : B {" and "};" to your code.  It should not be much
more error prone.  If you do write the special methods (clone, special
constructors) then this is true.

So I agree, partially... see my comments at the end.

>- runtime efficiency
>    - say you have a struct in your object; you can't return a readonly
>      pointer to it, so you either have to heap-allocate a new struct,
>      or make the wrapper contain a pointer, incurring double
>      dereferencing

But, by heap allocating it, you avoid the cost of copying it during the return.
The way I see it, you have three options:

1. Return by value - in which case you probably don't need const, unless the
object contains stuff that you want to protect.

2. Wrap in a readonly struct (no pointers) and return this by value.  The copy
into the readonly struct costs efficiency, about the same as returning the
original by value.

3. Heap allocation: avoids future copies (good), but costs an allocation
(bad).  So, a tradeoff.

I admit though, the approach doesn't work as well for struct as for class.

>    - if you do use a pointer, you have to heap-allocate the wrapped
>      struct, otherwise you can't pass the read-only version around
>      freely

I think you can wrap in a struct and pass the wrapper by value, which copies the
internal struct.  The value is copied as with any struct, but the wrapper
doesn't have the non-const methods, right?

A bigger annoyance is the fact that you can't write one function signature that
takes either type, unless it takes a template parameter.  I.e. structs don't
have inheritance, so there's no Foo -> const_Foo implicit conversion.
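
A rough sketch of both points, with the method bodies filled in just for
illustration.  freeze and firstPlusLast are hypothetical helpers, not part of
the earlier post, and passing by value here does copy the whole struct.

```d
// Read-write version (bodies are assumptions made for this sketch).
struct X
{
    int opIndex(size_t i) { return data_[i]; }
    int opIndexAssign(int value, size_t i) { return data_[i] = value; }

    private int[1024] data_;
}

// Read-only wrapper: holds its own copy of an X and exposes only reads.
struct const_X
{
    int opIndex(size_t i) { return impl[i]; }

    private X impl;
}

// Hypothetical helper: copies x into a wrapper that has no write methods.
const_X freeze(X x)
{
    return const_X(x);
}

// One signature for both types: any T with a readable opIndex will do.
// This is the template-parameter workaround; there is no X -> const_X
// implicit conversion to lean on.
int firstPlusLast(T)(T t)
{
    return t[0] + t[1023];
}
```

Later changes to the original X do not show through the wrapper, since the
wrapper owns a copy rather than a pointer.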

>    - two classes means two vtbls, two TypeInfos, ...

Small potatoes, and not entirely harmful (see below).

>Any thoughts on that? :)
>
>
>xs0

You're right - this approach has definite limits and is not free.  However, it
allows some flexibility that C++ (for instance) does not.

:const_Foo f = get_Object();
:
:Foo f_mod = cast(Foo) f;
:
:if (f_mod !is null) {
:   // modify via f_mod, only if allowed
:}

With this technique, you can create real "const_Foo" objects (i.e. they were
never a Foo) and test which kind you have at runtime.  Normally all "Foo"
objects are created as "Foo", but this approach enables you to build objects
that are designed to never change - someone has to do the clone operation to get
a modifiable version.
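
Put together with clone(), the cast gives a small copy-on-write helper.  This
is a sketch under assumptions: Bar here is a stand-in with a single int and a
dup() method, and writable is a name invented for the example.

```d
// Stand-in for the Bar of the earlier post (assumed to be copyable).
class Bar
{
    int value;
    this(int v) { value = v; }
    Bar dup() { return new Bar(value); }
}

// The read half.
class const_Foo
{
    this(Bar b) { x1 = b; }

    int doesWork() { return x1.value; }

    // const -> non-const: the copy gets its own Bar.
    Foo clone() { return new Foo(x1.dup); }

    protected Bar x1;
}

// The write half.
class Foo : const_Foo
{
    this(Bar b) { super(b); }

    void changeBar(Bar b2) { x1 = b2; }
}

// Copy-on-write: reuse the object if it really is a Foo, otherwise clone it.
Foo writable(const_Foo f)
{
    Foo f_mod = cast(Foo) f;   // null when f was built as a plain const_Foo
    return (f_mod !is null) ? f_mod : f.clone();
}
```

Code that holds a const_Foo reference calls writable() only at the point where
it actually needs to change something, so objects that are never modified are
never copied.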

This could be really useful in (for instance) cache designs: get an object from
the cache and pass it anywhere.  Since the const_Foo class uses "immutable"
semantics (like a Java String), this is safe to do.  It requires classes that
can be built entirely in the constructor, but many people write code like this
now.
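
As a sketch of the cache idea (const_Entry, Entry and Cache are hypothetical
names, and the fields are made up): the cache only ever constructs the const_
class, so a cast to the writable class fails and the only route to a modifiable
object is clone().

```d
// Hypothetical record type, fully built in its constructor.
class const_Entry
{
    this(string n, int s) { name_ = n; size_ = s; }

    string name() { return name_; }
    int size() { return size_; }

    // The only route to a modifiable copy.
    Entry clone() { return new Entry(name_, size_); }

    protected string name_;
    protected int size_;
}

class Entry : const_Entry
{
    this(string n, int s) { super(n, s); }

    void resize(int s) { size_ = s; }
}

class Cache
{
    // Stores a real const_Entry (never an Entry), so no cast makes it writable.
    void put(string key, string n, int s)
    {
        items_[key] = new const_Entry(n, s);
    }

    // Returns null on a miss.
    const_Entry get(string key)
    {
        const_Entry* p = key in items_;
        return (p is null) ? null : *p;
    }

    private const_Entry[string] items_;
}
```

Every caller of get() shares the same object, which is fine because nobody
holding it can change it.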

Similarly, I could create a Baz object that has all the readonly methods of Foo,
and derive it from const_Foo.  Now you can treat it as a const_Foo, but cannot
cast it to a Foo (since it isn't one).

Why would I do this?  Imagine I have a string class Foo - I could create a
const_Foo derived class where the actual data was a memory mapped file.  It has
all the *readonly* properties of a string, but you can't resize it.
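
Roughly like this.  The names are invented, the read-only base is written as an
abstract class to keep the example short, and a fixed buffer stands in for the
memory-mapped file, so treat it as an illustration of the shape rather than a
real implementation.

```d
// Read-only string interface (hypothetical names throughout).
abstract class const_Str
{
    abstract size_t length();
    abstract char opIndex(size_t i);
}

// The ordinary, resizable string.
class Str : const_Str
{
    this(string s) { data_ = s.dup; }

    override size_t length() { return data_.length; }
    override char opIndex(size_t i) { return data_[i]; }

    void append(char c) { data_ ~= c; }

    private char[] data_;
}

// Readable like a string, but not a Str: there is nothing to resize.
// A real version might wrap a memory-mapped file; here a fixed buffer
// stands in for the mapping (assumption for this sketch).
class FixedStr : const_Str
{
    this(string mapped) { view_ = mapped; }

    override size_t length() { return view_.length; }
    override char opIndex(size_t i) { return view_[i]; }

    private string view_;
}

// Works with either kind, since it only needs the read-only half.
size_t countChar(const_Str s, char c)
{
    size_t n = 0;
    for (size_t i = 0; i < s.length; i++)
        if (s[i] == c) n++;
    return n;
}
```

A cast(Str) on a FixedStr yields null, so code that wants to append can find
out at runtime that this particular string cannot grow.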

Or a readonly version of a database interface, which would dynamically provide
DB information for user queries, but would not allow storage into the database.
I could use this database interface as a front end for any number of data
sources that are not really databases.

You can't do any of these in C++, since C++'s const acts like a template - all
the work happens at compile time.  Having an extra vtbl allows you to do
runtime tricks too - and have compile time static type correctness for code that
just uses Foo without thinking about const.

Kevin