Component programming

Wed Jul 31 15:37:41 PDT 2013

On Wed, Jul 31, 2013 at 09:16:20PM +0000, Justin Whear wrote:
> On Wed, 31 Jul 2013 12:20:56 +0200, Chris wrote:
> 
> > This is only losely related to D, but I don't fully understand the
> > separation of component programming and OOP
[...]
> A few things:
> 1) The functions used in Walter's example are not methods, they are
> generic free functions.  The "interfaces" they require are not actual
> OOP interfaces, but rather a description of what features the supplied
> type must supply.
> 2) The avoidance of actual objects, interfaces, and methods means that
> the costly indirections of OOP are also avoided.  The compiler is free
> to inline as much of the pipeline as it wishes.
> 3) Component programming simplifies usage requirements, OOP frameworks
> complicate usage requirements (e.g. you must inherit from this class).
> 
> If anything, component programming is just functional programming +
> templates and some nice syntactic sugar.  And a healthy dose of pure
> awesome.

Keep in mind, though, that you pay for the avoidance of OO indirections
by template bloat. Every combination of types passed to your components
will create a new instantiation of that component. In simple cases, this
is generally only a handful of copies, so it's not a big deal; but for
certain frequently-used components, this can explode to a huge amount of
duplicated code. They will all avoid "costly" OO indirections, sure, but
you pay for that with larger code size, which means higher rate of CPU
cache misses, larger memory footprint of the code, etc..

This makes me wonder if we can somehow have a "happy marriage" of the
two approaches. Is it possible to have automatic template instantiation
factoring, such that in highly-used templates that generate a lot of
copies, can the compiler be made smart enough to figure out that
automatically adding indirections to the code to reduce the number of
instantiations might be better?

One case I've come across before is containers. For the sake of
genericity, we usually use templates to implement containers like, say,
a red-black tree. However, most of the code that deals with RB trees
don't really care about what type the data is at all; they implement
algorithms that operate on the structure of the RB tree, not on the
data. Only a small subset of RB tree methods actually need to know what
type the data should be (the methods that create a new node and
initialize it with data, return the data from a node, etc.). Yet, in a
template implementation of RB trees, every single method must be
repeatedly instantiated over and over, for every type you put into the
container.

Most of these method instantiations may in fact be essentially identical
to each other, except perhaps for one or two places where it may use a
different node size, or call some comparison function on the data in the
nodes.  Ideally, the compiler should be able to know when a method of a
templated struct/class is transitively independent of the template
parameter, and only emit code for that method once. All other
instantiations of that method will simply become aliases of that one
instantiation.

This doesn't cover the case where the call chain may pass through
methods that don't care about data types but eventually ends at a method
that *does* have to care about data types; but this is solvable by
factoring the code so that the type-independent code is separated from
the type-dependent code, except for one or two runtime parameters (e.g.
size of the data type, or a function pointer to the type-dependent code
that must be called at the end of, say, a tree traversal).  The compiler
may even be able to do this automatically in certain simple cases.

T

-- 
If it breaks, you get to keep both pieces. -- Software disclaimer notice