D programming practices: object construction order
Denis Koroskin
2korden at gmail.com
Fri Mar 6 12:22:01 PST 2009
It is a long and boring post so you may want to go right to the conclusion.
Consider the following class hierarchy:
class A
{
// fields and virtual methods
};
class B : public A
{
// fields and virtual methods
};
class C : public B
{
// fields and virtual methods
};
Here is what happens when an object of type C is constructed in C++:
1) Allocate memory to store C
2) Call A.ctor on object. This initializes vtbl (with pointers to methods of A) and passes control to user code
3) Call B.ctor on object. This initializes vtbl (with pointers to methods of B) and passes control to user code
4) Call C.ctor on object. This initializes vtbl (with pointers to methods of C) and passes control to user code
Let's call it bottom to top object construction.
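The steps above can be observed with a small C++ sketch. It is only an illustration: the log, the name() method and construction_log() are hypothetical helpers that are not part of the original example. Each constructor records which override a virtual call reaches while that constructor is running.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log, used only for this illustration.
static std::vector<std::string> g_log;

struct A {
    // While A's ctor runs, the vtbl points to A's methods.
    A() { g_log.push_back("A ctor sees " + name()); }
    virtual ~A() = default;
    virtual std::string name() const { return "A"; }
};

struct B : A {
    // A's ctor has finished; the vtbl now points to B's methods.
    B() { g_log.push_back("B ctor sees " + name()); }
    std::string name() const override { return "B"; }
};

struct C : B {
    // A's and B's ctors have finished; the vtbl now points to C's methods.
    C() { g_log.push_back("C ctor sees " + name()); }
    std::string name() const override { return "C"; }
};

// Constructs one C and returns what each constructor observed.
std::vector<std::string> construction_log() {
    g_log.clear();
    C c;
    assert(g_log.size() == 3); // one entry per constructor in the chain
    return g_log;
}
```

Calling construction_log() should yield the entries in the order A, B, C, and each constructor should only see its own override of name().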
Since we are studying good practices, assume that the ctors of A, B and C are written well and initialize all of their variables (or *intentionally* leave them uninitialized).
Since the base class ctor is forcibly called before any derived-class user code runs, a derived ctor in C++ can never see a base class member that its ctor has not initialized yet:
B::B(/*...*/) : /*...*/
{
// It is enforced by compiler that all of the A members are already initialized by now
// We also can not access any of the C members directly or indirectly (via virtual functions)
}
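As a minimal sketch of that guarantee for data members (Base, Derived and label_seen_by_derived() are hypothetical names chosen for this illustration), a derived ctor can rely on whatever the base ctor has set:

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string label;
    Base() : label("set by Base") {}
};

struct Derived : Base {
    std::string seen;
    // Base() has already run when the members of Derived are initialized,
    // so reading label here is safe.
    Derived() : seen(label) {}
};

// Constructs a Derived and returns the base member value its ctor observed.
std::string label_seen_by_derived() {
    Derived d;
    assert(!d.seen.empty()); // the base member was initialized before it was read
    return d.seen;
}
```

There is no way to reorder this in C++: the base class subobject is always constructed before the derived class members and the derived ctor body.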
Now let's take a look at D.
In D, we have a different object construction order:
1) Allocate memory to store C
2) Call C.ctor on the object and initialize it with C.init. This includes vtbl initialization with pointers to the methods of C.
3) Give control to user code, so that the user decides when the parent classes' ctors need to be run.
So the question is: when do we run the parent class ctor, at the beginning or at the end?
Since we are all talking about non-nullable types, we must be sure that they are indeed fully initialized before we access them:
class B
{
this() { foo = new Foo(); }
Foo foo;
}
class C : B
{
// example 1:
this()
{
writefln(foo.toString()); // error, foo is not initialized yet
super();
}
// example 2:
this()
{
super();
writefln(foo.toString()); // fine, foo is initialized
}
}
So here is recommendation №1:
- don't access base class members before base class ctor is run
What about virtual functions? Consider the following example:
class B
{
this() { foo = new Foo(); }
string toString() { return super.toString() ~ ", B: " ~ foo.toString(); }
Foo foo;
}
class C : B
{
string toString()
{
return super.toString() ~ ", C: " ~ bar.toString();
}
Bar bar;
// example 1:
this()
{
writefln(toString()); // Dang! foo and bar are not initialized but accessed
bar = new Bar();
super();
}
// example 2:
this()
{
bar = new Bar();
super();
writefln(toString()); // fine, foo and bar are initialized
}
}
So here is recommendation №2:
- initialize all your variables and call super() before you call any member function
A consequence of recommendation №2:
- don't pass 'this' to any function or store it globally before you initialize all your variables and call super(). Static methods are ok, because they don't have access to 'this' (unless it is passed as one of the parameters, of course).
What about virtual functions called inside base class ctor? Here is an example:
class B : A
{
Foo foo;
string toString() { return super.toString() ~ ", B: " ~ foo.toString(); }
this()
{
foo = new Foo();
writefln(toString()); // Dang! C.bar is not initialized yet. See below
}
}
class C : B
{
Bar bar;
string toString() { return super.toString() ~ ", C: " ~ bar.toString(); }
this()
{
super();
bar = new Bar();
}
}
And here is a gotcha: since the vtbl is constructed differently in D, we get no pure virtual function call errors. But the toString() call inside B's ctor dispatches to C.toString(), which reads bar before C's ctor has assigned it, so we are able to access variables that are not initialized yet.
Here comes recommendation №3:
- initialize all your variables *before* you call the base class ctor.
Now this is something that is different from C++, different from what we are used to. But this is the way we need to follow to make sure our fields are not accessed before they are initialized.
Conclusion
----------
Since D follows an object construction order different from that of C++, here is the recommended way to write a ctor:
class Foo : Bar
{
this()
{
// Initialize all your variables.
// This includes leaving some of them default-initialized on purpose (unless they are non-nullable).
// You shouldn't access any inherited fields or call any member functions yet.
super();
// now do something useful (object registration etc)!
// Your object is *fully* and *correctly* constructed by now.
// You may call any functions without any risk of accessing uninitialized members.
}
}
I think this is the only correct order to follow. I even believe that it should be statically enforced by the compiler. It certainly should be if we want to see non-nullable types in D one day.
What do you think?