Interesting Research Paper on Constructors in OO Languages

Tue Jul 16 15:01:57 PDT 2013

On Tue, Jul 16, 2013 at 06:17:48PM +0100, Regan Heath wrote:
> On Tue, 16 Jul 2013 14:34:59 +0100, Craig Dillabaugh
> <cdillaba at cg.scs.careton.ca> wrote:
> 
> >On Tuesday, 16 July 2013 at 09:47:35 UTC, Regan Heath wrote:
> >
> >clip
> >
> >>
> >>We have class invariants.. these define the things which must be
> >>initialised to reach a valid state.  If we had compiler
> >>recognisable properties as well, then we could have an
> >>initialise construct like..
> >>
> >>class Foo
> >>{
> >>  string name;
> >>  int age;
> >>
> >>  invariant
> >>  {
> >>    assert(name != null);
> >>    assert(age > 0);
> >>  }
> >>
> >>  property string Name...
> >>  property int Age...
> >>}
> >>
> >>void main()
> >>{
> >>  Foo f = new Foo() {
> >>    Name = "test",    // calls property Name setter
> >>    Age = 12          // calls property Age setter
> >>  };
> >>}

Maybe I'm missing something obvious, but isn't this essentially the same
thing as having named ctor parameters?

[...]
> The idea was to /use/ the code in the invariant to determine which
> member fields should be set during the initialisation statement and
> then statically verify that a call was made to some member function
> to set them.  The actual values set aren't important, just that some
> attempt has been made to set them.  That's about the limit of what I
> think you could do statically, in the general case.
[...]

This seems to be the same thing as using named parameters: assuming the
compiler actually supported such a thing, it would be able to tell at
compile-time whether all required named parameters have been specified,
and abort if not. There would be no need for any invariant-based
guessing of what fields are required and what aren't, and no need for
adding any property feature to the language -- the function signature of
the ctor itself indicates what is required, and the compiler can check
this at compile-time. (Of course, verifying the actual values passed to
the ctor can only happen at runtime -- which is OK.)

This still doesn't address the issue of ctor argument proliferation,
though: if each level of the class hierarchy adds 1-2 additional
parameters, you still need to write tons of boilerplate in your derived
classes to percolate those additional parameters up the inheritance
tree. If a base class ctor requires parameters parmA, parmB, parmC, then
any derived class ctor must declare at least parmA, parmB, parmC in
its function signature (or provide default values for them), and you
must still write super(parmA, parmB, parmC) in order to percolate these
parameters to the base class. If the derived class requires additional
parameters, say parmD, then that's added on top of all of the base class
ctor arguments. And any further derived class will now have to declare
at least parmA, parmB, parmC, parmD, and then tack on any additional
parameters they may need. This is not scalable -- deeply derived classes
will have ctors with ridiculous numbers of arguments.
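
To make the proliferation concrete, here is a minimal sketch of the
pattern described above. The class names and the int parameter types are
made up for illustration; only parmA..parmD come from the description.
It can be compiled with something like `dmd -unittest -main`.

```d
class Base
{
    int parmA, parmB, parmC;

    this(int parmA, int parmB, int parmC)
    {
        this.parmA = parmA;
        this.parmB = parmB;
        this.parmC = parmC;
    }
}

class Derived : Base
{
    int parmD;

    // Every base class parameter has to be redeclared just to forward it.
    this(int parmA, int parmB, int parmC, int parmD)
    {
        super(parmA, parmB, parmC);
        this.parmD = parmD;
    }
}

class MoreDerived : Derived
{
    int parmE; // hypothetical extra parameter, one level further down

    // The list keeps growing with each level of the hierarchy.
    this(int parmA, int parmB, int parmC, int parmD, int parmE)
    {
        super(parmA, parmB, parmC, parmD);
        this.parmE = parmE;
    }
}

unittest
{
    auto obj = new MoreDerived(1, 2, 3, 4, 5);
    assert(obj.parmA == 1 && obj.parmD == 4 && obj.parmE == 5);
}
```

Changing the signature of Base's ctor here means touching both Derived
and MoreDerived as well, even though neither cares about the change.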

Now imagine if at some point you need to change some base class ctor
parameters. Now instead of making a single change to the base class, you
have to update every single derived class to make the same change to
every ctor, so that the new version of the parameter (or new parameter)
is properly percolated up the inheritance tree. This defeats the OOP
goal of keeping the scope of changes localized. This
is especially bad when you need to add an *optional* parameter to the
base class: you have to do all that work of updating every single
derived class, yet most of the code that uses those derived classes doesn't
even care about this new parameter! That's a lot of work for almost no
benefit. (And you can't get away without doing it either, since a user
of a derived class may at some point want to customize that optional
base class parameter, so *all* derived class ctors must also declare it
as an optional parameter.)

I think my approach of using builder structs with a parallel inheritance
tree is still better: adding/removing/changing parameters to a base
class's builder struct automatically propagates to all derived classes
with no further code change. With the help of mixin templates, the
amount of boilerplate is greatly reduced. And thanks to the use of
typeof(super), you can even shuffle classes around your class hierarchy
without needing to change anything more than the base class name in the
class declaration -- the mixin automatically picks up the right base
class builder struct to inherit from, thus guaranteeing that the
parallel hierarchy is consistent at all times.
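
A rough sketch of what such a mixin could look like follows. This is a
reconstruction under assumptions, not the exact code from earlier in the
thread: the names BuilderArgs, Args, base, and the example fields are
all hypothetical. Each class gets a nested Args struct; if the base
class already has one, the new struct embeds it and forwards to it via
alias this.

```d
mixin template BuilderArgs(string fields)
{
    // typeof(super) is resolved in the class that instantiates the
    // mixin, so the right base class is picked up automatically.
    static if (is(typeof(super).Args))
    {
        struct Args
        {
            typeof(super).Args base; // parallel "inheritance"
            alias base this;         // base fields look like our own
            mixin(fields);
        }
    }
    else
    {
        // Root of the hierarchy: no base builder struct to embed.
        struct Args
        {
            mixin(fields);
        }
    }
}

class Base
{
    mixin BuilderArgs!q{
        int width = 80;
        int height = 24;
    };

    int width, height;

    this(Args args)
    {
        width = args.width;
        height = args.height;
    }
}

class Derived : Base
{
    mixin BuilderArgs!q{
        string title = "untitled";
    };

    string title;

    this(Args args)
    {
        super(args.base); // forward the embedded base builder as-is
        title = args.title;
    }
}

unittest
{
    Derived.Args args;
    args.width = 132;    // Base.Args field, reached through alias this
    args.title = "test";
    auto d = new Derived(args);
    assert(d.width == 132 && d.height == 24 && d.title == "test");
}
```

Adding a field to Base's list in this sketch needs no change in Derived:
its ctor only ever passes args.base along.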

The only weakness I can see is that mandatory arguments with no
reasonable default values can't be easily handled. In the simple cases,
you can expand the mixin to allow you to specify builder struct ctors
that have required arguments; but then this suffers from the same
scalability problems that we were trying to solve in the first place,
since all derived classes' builder structs will now require mandatory
arguments to be propagated through their ctors. But I think this
shouldn't be a big problem in practice: we can use Nullable fields in
the builder struct and have the class ctor verify that all mandatory
arguments are present, and throw an error if any arguments are not set
properly.
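
As a sketch of that last idea, assuming std.typecons.Nullable for the
mandatory fields and std.exception.enforce for the runtime check (so a
missing argument throws an Exception): the class Foo with name and age
is borrowed from the quoted example, and the Args struct is written out
by hand here to keep the sketch short.

```d
import std.exception : assertThrown, enforce;
import std.typecons : Nullable;

class Foo
{
    struct Args
    {
        // Mandatory arguments with no reasonable default start out null.
        Nullable!string name;
        Nullable!int age;
    }

    string name;
    int age;

    this(Args args)
    {
        // Runtime check that every mandatory argument was supplied.
        enforce(!args.name.isNull, "Foo: mandatory argument 'name' not set");
        enforce(!args.age.isNull, "Foo: mandatory argument 'age' not set");
        name = args.name.get;
        age = args.age.get;
    }
}

unittest
{
    Foo.Args args;
    args.name = "test";
    assertThrown(new Foo(args)); // age was never set

    args.age = 12;
    auto f = new Foo(args);
    assert(f.name == "test" && f.age == 12);
}
```

The check only happens at runtime, which is the trade-off against
mandatory ctor parameters that the compiler could verify statically.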

T

-- 
ASCII stupid question, getty stupid ANSI.