Static Const Object Polymorphism

Mon Mar 23 12:17:14 PDT 2015

Hi folks.

I just recently started to discovered the D language,
I watched some videos about patterns, but I havent really
something that comes close to the pattern I'm going to present.
I hope the format of this post is correct.If so, I'm sorry to
have wasted your time. The idea behind this pattern is quite
simple, I call it
compile time object orientation. In essence we are replicating
some parts of the object oriented model, just at compile time.
The result will be extremly flexible and easily configurable code,
which to my knowledge can't be produced effectively otherwise,
at least not in languages like C, C++, Java, C# etc. (and maybe
D?)
The code that is used can be found here:
https://github.com/klodkevin/static_polymorphism_D

Let's first start with the problem we are trying to solve.
We are going to build a really generic Queue with
a lot of compile time options, that is:

- The type of the container item.
- A pop behavior that defines results for calling pop on an empty
queue. (Exception, return null, etc.)
- size parameter, with if set to true we get additional function
returning us the size.
- threaded behavior, setting this flag to true will give us
thread safety for each operation
- debugFlag, a flag that prints some debug info.

Parts of the queue is implemented in the source in queue.d.
Now the idea of this pattern is to simply create a compile time
constant object
called QueueOptions, that groups all these options together.
So instead of instantiating the Queue with all five parameters,
we simply have one container.

Here is some code:

enum EmptyBehavior
{
	nullvalue,
	exception,
	ignored	
}

// queue option definition
class QueueOptions
{
	mixin polymorphType!("type", string);
	mixin polymorphField!(bool, "threaded", false);
	mixin polymorphField!(bool, "size", true);
	mixin polymorphField!(EmptyBehavior, "behavior",
EmptyBehavior.nullvalue);
	mixin polymorphField!(bool, "debugFlag", false);
}

// implementation for a queue.
class Queue(T)
{
private:

	// poly
	mixin polymorphCreate!("Q", T);

	static if (Q.size)
		size_t size;

	// mutex for storage
	static if (Q.threaded)
		Mutex m;

	... // further implementation.
}

// our class
class MyClass
{
	this(bool t)
	{
		value = t;
	}
	bool value;

	void print()
	{
		writeln("Have: " ~ std.conv.to!string(value));
	}
}

// overrides some of our default options
class Options : QueueOptions
{
	mixin polymorphType!("type", MyClass);
	mixin polymorphField!(behavior, EmptyBehavior.exception);
}

// some test
unittest
{
	auto queue = new Queue!Options();
	queue.push(new MyClass(true));
	queue.push(new MyClass(false));
	queue.push(new MyClass(false));

	// remove and print
	queue.pop().print();
	queue.pop().print();
	queue.pop().print();

	// throws exception if uncommented.
	//queue.pop();
}

What we do here is we apply some "compiler magic" to "override"
static const fields.
Here the QueueOptions class defines our five fields with default
values,
and the Options class derives from it, overriding the
EmptyBehavior
and the stored class. We now can use the class pretty easily and
uncommenting the last line yields to a thrown exception, as
specified.

To summarize this example, even if we had like 20 compile time
parameters,
we would still be able to get some nice syntax and a really
flexible way
of using our queue, depending on the need of the user.

The next example covers a sort algorithm.
The idea is the same as before. We make our sort algorithm really
flexible,
if we allow for some additional compile time object that contains
the info
on how to implement the sort. Here we define:

enum SortFunc
{
	normal,
	quick,
	bubble,
	random
}

class SortOptions
{
	mixin polymorphField!(string, "pred", "a < b");
	mixin polymorphField!(bool, "debugFlag", false);
}

class NormalSortOptions : SortOptions
{
	mixin polymorphField!(SortFunc, "method", SortFunc.normal);
	mixin polymorphField!(SwapStrategy, "strategy",
SwapStrategy.unstable);
}

class RandomSortOptions : SortOptions
{
	mixin polymorphField!(SortFunc, "method", SortFunc.random);
	mixin polymorphField!(int, "rng_param", 2);
}

unittest
{
	// specify our sort function.
	class Options2: RandomSortOptions
	{
		mixin polymorphField!(rng_param, 3);
		mixin polymorphField!(debugFlag, true);
	}
	void rsort(Range)(Range r) { csort!(Options2)(r); }

	// do some testing.
	int[] a = [2, 3, 4, 1];
	rsort(a);
	assert(a == [1, 2, 3, 4]);
}

csort is our custom sort method. (sort.d)
As we see we override the default sort method
to use our random sort (ok the implementation is just a stub in
sort.d for demonstration purposes).
Again we can add a tremendous amount of sort options via such
compile time objects with
this method, making it easy for the user to choose what strategy
to use
and not bloating the interface of the sort method.

For the next example we get really cocky. We want a sorted array,
where we can
insert elements of any type that we can compare and remain sorted.
Of course we add some Sorted array compile time static object:

class SortedArrayOptions
{
	mixin polymorphType!("type", string);
	mixin polymorphField!(bool, "threaded", false);
	mixin polymorphField!(bool, "debugFlag", false);
	mixin polymorphType!("sortType", NormalSortOptions);
}

Now we have the field sortType, which contains our method
information used for sorting.
So in total, we have a lot of compile time choices to make:

- All sorting options, that is 4 parameters for sortType.
We also note here that if one sort option would have 8 parameters
and the other just 6, we would not need to change any code here.

- 3 parameters for our sorted array.

So in total we end up with 7 template parameters we want to use.
Our SortedArray implementation must be complicated. Wrong!

class SortedArray(T)
{
	mixin polymorphCreate!("Q", T);

	typeof(Q.type)[] data;

	this()
	{
		data = new typeof(Q.type)[0];
	}

	void insert(typeof(Q.type)[] objs)
	{
		static if (Q.debugFlag)
			writeln("inserting.");

		data ~= objs;
		csort!(typeof(Q.sortType), typeof(Q.type)[])(data);
	}

	auto get()
	{
		return data;
	}
}

Although we didn't implement threaded and other methods at all,
the concepts
of sort is completly independent of the container options.
This means in fact that we just can forward our template
parameters
to our sort function immediatly, without even checking them.
Changing the sort method to accept more options now leads to next
to no changes
to our sortedArray class. Also a variable number of template
arguments for the sort method leads to no changes.
Here is some unittest that demonstrates the usage:

unittest
{

	// define our customized sorted array
	class Options : SortedArrayOptions
	{
		class CustomSort : RandomSortOptions
		{
			mixin polymorphField!(rng_param, -9);
			mixin polymorphField!(pred, "a > b");
			mixin polymorphField!(debugFlag, true);
		}

		// sort option
		mixin polymorphType!("sortType", CustomSort);
		mixin polymorphField!(debugFlag, true);
		mixin polymorphType!("type", int);
	}
	alias Sarray = SortedArray!(Options);

	// do some testing
	int[] unsorted1 = [2, 3, 4, 1];
	int[] unsorted2 = [-1, 0, 5, 6];
	auto d = new Sarray();
	d.insert(unsorted1);
	d.insert(unsorted2);
	assert(d.get() == [6, 5, 4, 3, 2, 1, 0, -1]);

	// Use our already defined behavior Sarray, just change the type
to float
	class Options2 : Options
	{
		mixin polymorphType!("type", float);
	}
	alias SarrayFloat = SortedArray!(Options2);

	// do some more testing
	float[] unsorted1f = [2f, 3f, 4f, 1f];
	float[] unsorted2f = [-1f, 0f, 5f, 6f];
	auto df = new SarrayFloat();
	df.insert(unsorted1f);
	df.insert(unsorted2f);
	assert(df.get() == [6f, 5f, 4f, 3f, 2f, 1f, 0f, -1f]);

	// does not compile, as we have the type float
	//df.insert(unsorted1);
}

As you can see the usage is quite simple as well flexible.
Of course the syntax might be better, but im not a magician.

  From a more abstract view the following is going on using compile
time object orientation:

We want to combine normal objects using aggregation to build new
objects.
Unfortunately aggregate templated objects for which we can change
the behavior at compile time leads to a bloat of the template
signature.
Think of it. Lets assume we want to create a general purpose
configuration parser for JSON.
This file parser has probabily some of the following methods or
behavior
which we want to choose at compile time:

- an allocator to be specified for the data we store?
- the memory model to be specified?
- The file operations (buffered, unbuffered, stream options, file
format, local buffer), ability to write?
- Thread safety, async operations, callbacks?
- Error handling (checking return null or throw exception, error
handling for async operations?)
- Parsing engine to be used?

As you can see that would lead to huge bloat of template
signature,
which would essentially be unable to handle. But I can imagine
that some parts
of it the implementation is somehow orthogonal, that is the file
operations, the memory model, the allocator, etc..
This means we would just forward our compile time static object
to the allocator, the file object
implementing the file operations etc.

On the other hand this parser would be really flexible and by
sticking with
default configurations easy to use. (Maybe just override some 2
or 3 options, or just overriding memory model)
If we want more specific usage we can maybe use a better fitting
file implemention
specific to our needs (e.g. better performance). Also we don't
have to repeat
the usage pattern every time, just declare one class and use one
alias.
So having a really general model and beeing efficient
is not neccesarily an opposite if the class is well designed.

For methods we can basically do the same. Lets assume we don't
have compile time static objects and
want to specify our really general method. Well, there are
basically two options on how to handle this:

- Adding an object controlling what will be executed as method
parameter:
This is ugly. Think of it. Really. Why adding a run time parameter
to your method if you want to specify the function at compile
time.
Think of the implications if you want your class to be generic.
Think of the performance. Add a const field to the class that
gets initialized by the constructor?
Adding a template parameter to your class and then instantiate an
object of this class
to pass it to the method? Doesn't sound clever to me.

- Overloading:
Make a number of x methods on e.g. sort, bubbleSort, quickSort,
etc. with different parameter inputs.
This works good for few sort options. We might even add templates
to it to specify the overloading but then we have the signature
bloat problem
for more advanced options or classes that use this method.
On the otherhand deciding what sort function we actually want
to use based the type of the function arguments might not be what
we want.

Looking at a sort!(T)(args) function we make the following
observation (Ob):

What we want to do: sort
So basically the name is acts like an interface, it tells us WHAT
we want to do.

Template arguments: T
The template arguments specify HOW we implement the operation.
Using a compile time constant object we can effectively "hide"
this fact easily,
even if the operation is actually really complicated and can be
customized a lot.
This means for example library code can just forward the user
specified implementation
(i.e. of the mempory model) to the underlying handler, without
givin much care.
(and no bloat of signature)

Function arguments: args
We know what we do.
We know how we do it.
Use the args to actually do it.

To start the Discussion:
I think support for a feature like this in the core language
would be a great benefit.
Im going to talk a bit about why.

1. Object Orientation
In essence we are just replicating whats already there for the
runtime. That is object orientation.
Just for compile time. We can group template arguments together,
we add inheritence,
using virtual compile time constansts (must be set by the
deriving classes)
which basically gives us flexibility, ease to use and more
expressive power.
So by utilizing a good compile time static object model we would
be able to add a common
tool of solving programming problems (that is OO) to D at compile
time.
The exact implementation may vary, but it seems to be easier to
use (at least for me),
especially if your background is an OO language.

2. Performance and flexibility
Since D tries to be a flexible and performant language,
this tool is basically perfectly suited for it. We can
write general purpose default methods and really complicated
low level operations for certain tasks. Changing methods
for internal computation of aggregated types would then just mean
overriding
a variable to specify the other method. Also adding features
and new methods is easy, just specify the usage in the compile
time object later,
or if needed.

3. Better API design
Look at std::fstream of C++. At the constructor you can set
wether you want to open it
as input, as output, as both, as binary or append or....
Looking at our observation (Ob) this is bad style.
The actual used implementation should be specified by a compile
time argument,
if we don't want to change it at runtime.

So in none working pseudo code:

// maybe in client code or in library code as common usage pattern
meta Options : FStreamOptions
{
	mode = read,
	type = binary
}
alias Input = fstream!Options;

// here
Input in(file.a);
in.read();

// somewhere else
Input in(file.a);
in.read();

// somewhere else from douchebag
Input in(file.a);
in.write();

Now the write shouldn't be available (I know that there is
ofstream),
but the same can be said about the exception flag.
(Have you ever used a file in exception mode and then switched
back or vice versa?
Really? Why is there a failbit flag in exception mode? Do you
really need that?)
Of course you could add C++ templates to it, but obviously they
are an abomination.
(fstream with like six template parameters for general usage?
Goodbye.)
Another example would be Java: UnsupportedOperationException.
Due to how OO is designed there, one may need to throw an
unsupported operation exception
during runtime due to a call to an operation that is not allowed.
(If its not allowed in all cases, then why is the method even
there?)
Additionally for library writers, the compile time static object
may contain logic
to verify if the input is correct or not. Now the user just has
to look at the implementation
of the compile time static object to check what is allowed,
instead of the usual
class x(T)
if(someHugeBulkOfArgumentsSinceWeHaveLike20TemplateParameters).

Another example e.g. in Java is:
ArrayBlockingQueue, ArrayDeque, ConcurrentLinkedQueue,
DelayQueue, LinkedBlockingQueue, LinkedList, SynchronousQueue,
etc.

Are some queue implementations. Using some compile time static
objects we could just use "Queue!Options"
and in options specify our exact implementation, i.e. if the
implementation is thread safe, if implemented
as deque, change some default deque parameters, set someblocking
paramter etc. What do you do when you want to
use a SynchronousArrayQueue? Or a ConcurrentArrayQueue in Java? I
hope you get the point.
The Java way is to not use conditional compilation and create for
each possible configuration your own class or use some fancy
inheritence.
It works well most of the time, but sometimes its just screwed up.
Because essentially what Java does is doing Queue overload the
hard way. It is not clear what implementations
are possible, each is not modifiable at compile time etc. With
compile time static objects the possible
usage pattern would just be in the Options class, which can
easily be looked at.
Also we don't have an explosion of possible classes or bloat
using generics.

4. Expressiveness
Also the above example clearly shows the expressiveness of this
method.
We can group together usage patterns from code to compile time in
a compile time static object
(here we call the class meta). We could also add a file variable
to FStreamOptions,
which is then opened for reading, if the file is known at compile
time.
(the constructor with the file name may be removed then?)
That means we remove the stuff that is known at compile time and
factor it out so that we see the actual control flow of our
program more clearly.
Also specifying methods and objects implementations to use is
just setting equality to some variables,
no fancy overriding or setting of function pointers or adding
runtime variables
to functions to choose the actual implementation if you know in
advance.

5. Easy to add
Well I have no idea of the compiler, but I think that the feature
shouldn't be that
difficult to add, since we just have to choose the last
overriding value,
like choosing the last overriding function for normal classes.
(thats essentially what I do in my buggy implementation)
One might also think of adding multiple inheritence, since we are
purely
value based and hence avoid some difficulty of multiple
inheritance.
(well not the diamond problem.)

There are probably more advantages to this, but let's talk about
some disadvantages.
Of course, like any language feature they can be abused.

1. Static if hell
Putting to much features in one class can lead to difficult code.
To much static ifs on like 20 parameters, especially if they are
not orthogonal
and all interact with each other are a real pain. The possible
numbers
of classes are 2 to the power of 20, if we just assume bool
variables.
If they all interact with each other, it gets really messy. So
some
kind of low coupling of the variables is required. If not, it
gets really messy.
Using compile time static objects makes it easier to do such
stuff.

2. Setting variables to be compile time when they really should
be run time.
In one of my examples I have written rng_param, the parameter for
the random sort.
Now this parameter is compile time. If I needed to change that
behavior at runtime,
I can basically throw away all code that uses this random number
generator.
On the other hand one may add a flag to indicate
wether the rng_param could be changeable at runtime, but to
implement this behavior we
have to do extra work.

3. Complexity with inheritence
Using the presented pattern it is relatively easy to build large
structures that are determined
completly at compile time.
Other languages like Java, C# or even C++ often fall back to
interfaces
to write some kind of polymorph code for a large codebasis or
hide the implementation.
Using compile time static objects we don't need that anymore, at
least in some cases.
If we use the strategy pattern using templates instead of
inheritance,
we may end up going 10 layers of static compile time inheritance
down in our static compile time object
to the one point where it is used. And we can modify the strategy
that is used by simply overriding a value.
One may start of thinking to use some kind of database for
configuration at some point, but ok.
Generally speaking, it can get really complex because the
implementation is not really "hidden",
it is there implicitly all the time. Maybe thats what we want,
maybe not. Maybe it's not a
problem, since we don't inherit that deep and we use this pattern
mainly in library code, but still.
On the other hand, there is basically no real world experience
(not to my knowledge), so I can't tell.

What are your thoughts on this?

klodkevin