Interesting Research Paper on Constructors in OO Languages

Tue Jul 16 01:17:30 PDT 2013

On Tue, Jul 16, 2013 at 03:54:27AM +0200, Meta wrote:
> On Monday, 15 July 2013 at 22:29:14 UTC, H. S. Teoh wrote:
> >I consider myself to be a "systematic" programmer (according to the
> >definition in the paper); I can work equally well with ctors with
> >arguments vs. create-set-call objects. But I find that mandatory
> >ctors with arguments are a pain to work with, *both* to write and to
> >use.
> 
> I also find constructors with multiple arguments a pain to use. They
> get difficult to maintain as your project grows. One of my pet
> projects has a very shallow class hierarchy, but the constructors of
> each object down the tree have many arguments, with descendants
> adding on even more. It gets to be a real headache when you have
> more than 3 constructors per class to deal with base class
> overloads, multiple arguments, etc.

Yeah, when every level of the hierarchy introduces 2-3 new overloads of
the ctor, you get an exponential explosion of derived class ctors if you
want to account for all possibilities. Most of the time, you just end up
oversimplifying 'cos anything else is simply unmanageable.

[...]
> Having to create other objects to pass to a constructor is
> particularly painful. You'd better pray that they have trivial
> constructors, or else things can get hairy really fast. Multiple
> nested constructors can also create a large amount of code bloat.
> Once the constructor grows large enough, I generally put each
> argument on its own line to ensure that it's clear what I'm calling
> it with. This has the unfortunate side effect of making the call
> span multiple lines. In my opinion, a constructor requiring more
> than 10 lines is an unsightly abomination.

I usually bail out way before then. :) A 10-line ctor call is just
unpalatable.

[...]
> I've found that a good way to keep constructors manageable is to use
> the builder pattern. Create a builder object that has its fields set
> by the programmer, which is then passed to the 'real' object for
> construction. You can provide default arguments, optional arguments,
> etc. Combine this with a fluent interface and I think it looks a lot
> better. Of course, this has the disadvantage of requiring a *lot* of
> boilerplate, but I think this could be okay in D, as a builder class
> is exactly the kind of thing that can be automatically generated.

In my C++ version of this, you could even just reuse the builder object
directly, since it's just a struct containing ctor arguments. But yeah,
there's some amount of boilerplate necessary.
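
To make the builder idea concrete, here is a minimal C++ sketch of a
struct of ctor arguments with defaults and fluent setters. The class
name Window, its fields, and the setter names are all hypothetical and
only serve as illustration; this is one possible shape under those
assumptions, not the code from either project mentioned above.

```cpp
#include <string>
#include <utility>

// Hypothetical example class; none of these names come from the thread.
class Window {
public:
    // The builder is just a struct of ctor arguments with defaults.
    struct Args {
        int width = 640;
        int height = 480;
        std::string title = "untitled";

        // Fluent setters return *this so that calls can be chained.
        Args& setWidth(int w) { width = w; return *this; }
        Args& setHeight(int h) { height = h; return *this; }
        Args& setTitle(std::string t) { title = std::move(t); return *this; }
    };

    // A single ctor replaces the whole family of overloads.
    explicit Window(const Args& args)
        : width_(args.width), height_(args.height), title_(args.title) {}

    int width() const { return width_; }
    int height() const { return height_; }
    const std::string& title() const { return title_; }

private:
    int width_;
    int height_;
    std::string title_;
};
```

A call then names only the arguments it changes, e.g.
Window w(Window::Args().setWidth(800).setTitle("demo")); and everything
else keeps its default.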

[...]
> >In the spirit of this approach, I've written some C++ code in the
> >past that looked something like this:
> >
> >	class BaseClass {
> >	public:
> >		// Encapsulate ctor arguments
> >		struct Args {
> >			int baseparm1, baseparm2;
> >		};
> >		BaseClass(Args args) {
> >			// initialize object based on fields in
> >			// BaseClass::Args.
> >		}
> >	};
> >
> >	class MyClass : public BaseClass {
> >	public:
> >		// Encapsulate ctor arguments
> >		struct Args : BaseClass::Args {
> >			int parm1, parm2;
> >		};
> >
> >		MyClass(Args args) : BaseClass(args) {
> >			// initialize object based on fields in args
> >		}
> >	};
[...]
> See above, this is basically the builder pattern. It's a neat trick,
> giving your args objects a class hierarchy of their own. I think that
> one drawback of that, however, is that now you have to maintain *two*
> class hierarchies. Have you found this to be a problem in practice?

Well, there *is* a certain amount of boilerplate, to be sure, so it
isn't a perfect solution. But nesting the structs inside the class they
correspond with helps to prevent mismatches between the two hierarchies.
It also allows reusing the name "Args" so that you don't have to invent
a whole new set of names just for these builders. Minimizing these
differences makes mistakes less likely, such as inheriting Args from
the wrong base class.
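
For reference, a filled-in sketch of the nested-Args arrangement in C++
might look like the following. The default values, the const-reference
parameters, and the baseSum()/sum() accessors are assumptions added so
that the example is complete; the quoted original only outlined the
structure.

```cpp
class BaseClass {
public:
    // Encapsulate ctor arguments (defaults are illustrative).
    struct Args {
        int baseparm1 = 1;
        int baseparm2 = 2;
    };

    explicit BaseClass(const Args& args)
        : baseparm1_(args.baseparm1), baseparm2_(args.baseparm2) {}
    virtual ~BaseClass() = default;

    int baseSum() const { return baseparm1_ + baseparm2_; }

private:
    int baseparm1_;
    int baseparm2_;
};

class MyClass : public BaseClass {
public:
    // Same name, nested in the class it belongs to, and derived from
    // the Args of the direct base class.
    struct Args : BaseClass::Args {
        int parm1 = 10;
        int parm2 = 20;
    };

    explicit MyClass(const Args& args)
        : BaseClass(args),  // binds to the BaseClass::Args subobject
          parm1_(args.parm1), parm2_(args.parm2) {}

    int sum() const { return baseSum() + parm1_ + parm2_; }

private:
    int parm1_;
    int parm2_;
};
```

User code fills in one MyClass::Args object, setting base and derived
fields alike, and passes it to the ctor; each level of the hierarchy
reads only the fields it knows about.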

In fact, now that I think of it, in D this could actually work out
even better, since you could just write:

	class MyClass : BaseClass {
	public:
		// static, so Args can be created without a MyClass instance
		static class Args : typeof(super).Args {
			int parm1 = 1;
			int parm2 = 2;
		}

		this(Args args) {
			super(args);
			...
		}
	}

The compile-time introspection allows you to just write "class Args :
typeof(super).Args" consistently for all such builders, so you never
have to worry about inventing new names or mismatches in the two
hierarchies. The "typeof(super).Args" will automatically pick up the
correct base class Args to inherit from, even if you shuffle the classes
around the hierarchy. Furthermore, since the declaration is exactly
identical across the board (except for the actual fields), you could
just factor this into a mixin and thereby minimize the boilerplate.

The only major disadvantage of the D version is that you can't use
structs; you have to allocate the Args objects on the GC heap, so you
may end up generating lots of GC garbage. If only D structs had
inheritance, this would've been a much cleaner solution.

> As an aside, you could probably simulate the inheritance of the args
> objects in D either with alias this or even opDispatch. Still, this
> means that you need to nest the structs within each-other, and this
> could get silly after 2-3 "generations" of args objects.

Hmm. This is a good idea! And with a mixin, this may not turn out so bad
after all. Maybe start with something like this:

	class BaseClass {
	public:
		struct Args {
			int baseparm1 = 1;
			int baseparm2 = 2;
			...
		}
		this(Args args) {
			// initialize object based on fields in args
		}
	}

	class MyClass : BaseClass {
	public:
		struct Args {
			typeof(super).Args base;
			alias base this;

			int parm1 = 1;
			int parm2 = 2;
			...
		}
		this(Args args) {
			super(args);	// works 'cos of alias this
		}
	}

	void main() {
		MyClass.Args args;
		args.baseparm1 = 2;	// works 'cos of alias this
		args.parm1 = 3;
		auto obj = new MyClass(args);
	}

Using alias this, we have the nice effect that user code no longer needs
to refer to the .base member of the structs, and indeed, doesn't need to
know about it. So this is effectively like struct inheritance... heh,
cool. Just discovered a new trick in D: struct inheritance using alias
this. :)

The boilerplate can be put into a mixin, say something like this:

	mixin template BuilderArgs(string fields) {
		struct Args {
			typeof(super).Args base;
			alias base this;
			mixin(fields);
		}
	}

	class MyClass : BaseClass {
	public:
		// Hmm, doesn't look too bad!
		mixin BuilderArgs!(q{
			int parm1 = 1;
			int parm2 = 2;
		});
		this(Args args) {
			super(args);
			...
		}
	}

	class AnotherClass : BaseClass {
	public:
		// N.B. Looks exactly the same as MyClass.Args except
		// for the fields! The template automatically picks up
		// the right base class Args to "inherit" from.
		mixin BuilderArgs!(q{
			string anotherparm1 = "abc";
			string anotherparm2 = "def";
		});
		this(Args args) {
			super(args);
			...
		}
	}

Not bad at all! Though I haven't actually tested any of this code, so
I've no idea if it will actually work yet. But it certainly looks
promising! I'll give it a spin tomorrow morning (way past my bedtime
now).
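
For comparison, a loose C++ analogue of factoring out the Args
boilerplate is a class template that derives from the base class's Args
and from a plain struct holding the new fields. The helper name
ExtendArgs and the AnotherFields struct are hypothetical, and since C++
has no typeof(super), the base class must still be named explicitly.
This is a sketch of a related technique, not a translation of the D
mixin.

```cpp
#include <string>

// Hypothetical helper, not from the thread: derive from Base's Args
// and from a plain struct holding the new fields.
template <class Base, class Fields>
struct ExtendArgs : Base::Args, Fields {};

class BaseClass {
public:
    struct Args {
        int baseparm1 = 1;
        int baseparm2 = 2;
    };

    explicit BaseClass(const Args& args)
        : baseparm1(args.baseparm1), baseparm2(args.baseparm2) {}
    virtual ~BaseClass() = default;

    const int baseparm1;
    const int baseparm2;
};

// Only the fields differ from class to class.
struct AnotherFields {
    std::string anotherparm1 = "abc";
    std::string anotherparm2 = "def";
};

class AnotherClass : public BaseClass {
public:
    // The base class has to be named here; C++ has no typeof(super).
    using Args = ExtendArgs<BaseClass, AnotherFields>;

    explicit AnotherClass(const Args& args)
        : BaseClass(args),  // binds to the BaseClass::Args subobject
          anotherparm1(args.anotherparm1),
          anotherparm2(args.anotherparm2) {}

    const std::string anotherparm1;
    const std::string anotherparm2;
};
```

The per-class boilerplate shrinks to one using-declaration plus a
fields struct, at the cost of declaring the fields outside the class
they belong to.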

T

-- 
Meat: euphemism for dead animal. -- Flora