Interesting Research Paper on Constructors in OO Languages

Thu Jul 18 02:13:58 PDT 2013

On Wed, 17 Jul 2013 18:58:53 +0100, H. S. Teoh <hsteoh at quickfur.ath.cx>  
wrote:
> On Wed, Jul 17, 2013 at 11:00:38AM +0100, Regan Heath wrote:
>> Emphasis on "create-set-call" :)  The weakness to create-set-call
>> style is the desire for a valid object as soon as an attempt can be
>> made to use it.  Which implies the need for some sort of enforcement
>> of initialisation and as I mentioned in my first post the issue of
>> preventing this initialisation being spread out, or intermingled with
>> others and thus making the semantics of it harder to see.
>
> Ah, I see. So basically, you need some kind of enforcement of a
> two-state object, pre-initialization and post-initialization. Basically,
> the ctor is empty, so you allocate the object first, then set some
> values into it, then it "officially" becomes a full-fledged instance of
> the class. To prevent problems with consistency, a sharp transition
> between setting values and using the object is enforced. Am I right?

Yes, that's basically it.

> I guess my point was that if we boil this down to the essentials, it's
> basically the same idea as a builder pattern, just implemented slightly
> differently. In the builder pattern, a separate object (or struct, or
> whatever) is used to encapsulate the state of the object that we'd like
> it to be in, which we then pass to the ctor to create the object in that
> state. The idea is the same, though: set up a bunch of values
> representing the desired initial state of the object, then, to borrow
> Perl's terminology, "bless" it into a full-fledged class instance.

It achieves the same ends, but does it differently.  My idea requires  
compiler support (which makes it unlikely to happen) and doesn't require  
separate objects (which I think is a big plus).

>> So, to take my idea a little further - WRT class inheritance.  The
>> compiler, for a derived class, would need to inspect the invariants
>> of all classes involved (these are and-ed already), inspect the
>> constructors of the derived classes (for calls to initialise
>> members), and the initialisation block I described and verify
>> statically that an attempt was made to initialise all the members
>> which appear in all the invariants.
>
> I see. So basically the user still has to set up all required values
> before you can use the object, the advantage being that you don't have
> to manually percolate these values up the inheritance tree in the ctors.

Exactly.

> It seems to be essentially the same thing as my approach, just
> implemented differently. :)[...]

Thanks for the description of your idea.

As I understand it, in your approach all the mandatory parameters for all  
classes in the hierarchy are /always/ passed to the final child  
constructor.  In my idea a constructor in the hierarchy could choose to set
some of the mandatory members of its parents, and the compiler would
detect that and would not require the initialisation block to contain  
those members.

Also, in your approach there isn't currently any enforcement that the user  
sets all the mandatory parameters of Args, and this is kinda the main  
issue my idea solves.

> One thing about your implementation that I found limiting was that you
> *have* to declare all required fields on-the-spot before the compiler
> will let your 'new' call pass, so if you have to create 5 similar
> instances of the class, you have to copy-n-paste most of the set-method
> calls:
>
> 	auto obj1 = new C() {
> 		name = "test1",
> 		age = 12,
> 		school = "D Burg High School"
> 	};
>
> [...]
>
> Whereas using my approach, you can simply reuse the Args struct several
> times:
>
> 	C.Args args;
> 	args.name = "test1";
> 	args.age = 12;
> 	args.school = "D Burg High School";
> 	auto obj1 = new C(args);
>
> 	args.name = "test2";
> 	auto obj2 = new C(args);
>
> 	args.name = "test3";
> 	auto obj3 = new C(args);
>
> 	... // etc.

Or.. you use a mixin, or better still you add a copy-constructor or .dup  
method to your class to duplicate it :)
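
A minimal sketch of the .dup route, assuming a class C with the three
fields used in the quoted example.  The dup() method here is a
hypothetical hand-written helper, not something the language provides
for classes:

```d
class C
{
    string name;
    int age;
    string school;

    // Hypothetical helper: field-by-field copy of this instance.
    C dup()
    {
        auto c = new C();
        c.name = name;
        c.age = age;
        c.school = school;
        return c;
    }
}

void main()
{
    auto obj1 = new C();
    obj1.name = "test1";
    obj1.age = 12;
    obj1.school = "D Burg High School";

    // Copy the configured instance, then change only what differs.
    auto obj2 = obj1.dup();
    obj2.name = "test2";

    assert(obj1.name == "test1");
    assert(obj2.age == 12 && obj2.school == obj1.school);
}
```

Only the first instance needs the full set of assignments; later ones
start from a copy.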

> You can also have different functions set up different parts of C.Args:
>
> 	C createObject(C.Args args) {
> 		// N.B. only need to set a subset of fields
> 		args.school = "D Burg High School";
> 		return new C(args);
> 	}
>
> 	void main() {
> 		C.Args args;
> 		args.name = "test1";
> 		args.age = 12;		// partially setup Args
> 		auto obj = createObject(args); // createObject fills out rest of the fields.
> 		...
>
> 		args.name = "test2";	// modify a few parameters
> 		auto obj2 = createObject(args); // createObject doesn't need to know about this change
> 	}
>
> This is nice if there are a lot of parameters and you don't want to
> collect the setting up of all of them in one place.

In my case you can call different functions in the initialisation block,  
e.g.

void defineObject(C c)
{
   c.school = "...";
}

C c = new C() {
   defineObject()
}

:)

>> I think another interesting idea is using the builder pattern with
>> create-set-call objects.
>>
>> For example, a builder template class could inspect the object for
>> UDAs indicating a data member which is required during
>> initialisation.  It would contain a bool[] to flag each member as
>> not/initialised and expose a setMember() method which would call the
>> underlying object setMember() and return a reference to itself.
>>
>> At some point, these setMember() methods would want to return another
>> template class which contained just a build() member.  I'm not sure
>> how/if this is possible in D.
> [...]
>
> Hmm, this is an interesting idea indeed. I think it may be possible to
> implement in the current language.

The issue, I think, is the step where you want to change the return type
from the type with setX members to the type with build().

> Maybe we can make use of UDAs to indicate which fields are mandatory

That was what I was thinking.
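
One way the return-type change might be expressed, as a rough sketch
only: carry the names of the members set so far as template arguments,
so that every set call returns a different Builder type, and build()
only exists once every @Required field is in that list.  Required,
Builder, builder, set and allRequiredSet are all made-up names for
illustration, and the code assumes a reasonably recent compiler with
std.traits.hasUDA and std.meta.staticIndexOf:

```d
import std.meta : staticIndexOf;
import std.traits : FieldNameTuple, hasUDA;

// Hypothetical marker UDA for members that must be initialised.
struct Required {}

class C
{
    @Required string name;
    @Required int age;
    string school;
}

// Evaluated at compile time: true when every @Required field of T
// appears in the list of already-set member names, Done.
bool allRequiredSet(T, Done...)()
{
    bool ok = true;
    foreach (name; FieldNameTuple!T)
    {
        static if (hasUDA!(__traits(getMember, T, name), Required)
                   && staticIndexOf!(name, Done) == -1)
        {
            ok = false;
        }
    }
    return ok;
}

struct Builder(T, Done...)
{
    private T obj;

    // Sets one member and returns a Builder whose type records it.
    auto set(string member, V)(V value)
    {
        __traits(getMember, obj, member) = value;
        return Builder!(T, Done, member)(obj);
    }

    // build() is only declared once all @Required members are set.
    static if (allRequiredSet!(T, Done)())
    {
        T build() { return obj; }
    }
}

auto builder(T)() { return Builder!T(new T()); }

void main()
{
    auto c = builder!C().set!"name"("test1").set!"age"(12).build();
    assert(c.name == "test1" && c.age == 12);

    // age was never set, so this Builder type has no build() member.
    static assert(!__traits(compiles,
        builder!C().set!"name"("x").build()));
}
```

This moves the check to compile time without compiler changes, at the
cost of a separate builder type per combination of members set.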

> [...]
> Just a rough idea, haven't actually tried to compile this code yet.

Worth a go, it doesn't require compiler support like my idea so it's far  
more likely you'll get something at the end of it.. I can just sit on my  
hands and/or try to promote my idea.

I still prefer my idea :P.  I think it's cleaner and simpler, this is in  
part because it requires compiler support and that hides the gory details,  
but also because create-set-call is a simpler style in itself.  Provided  
the weaknesses of create-set-call can be addressed I might be tempted to  
use that style.

R
