D component programming is a joke (Was: Re: Component programming)

H. S. Teoh hsteoh at quickfur.ath.cx
Wed Jul 31 17:46:06 PDT 2013


On Wed, Jul 31, 2013 at 11:52:35PM +0000, Justin Whear wrote:
> On Thu, 01 Aug 2013 00:23:52 +0200, bearophile wrote:
> > 
> > The situation should be improved for D/dmd/Phobos, otherwise such D
> > component programming remains partially a dream, or a toy.
> > 
> > Bye,
> > bearophile
> 
> I disagree with your "toy" assessment.  I've been using this chaining,
> component style for a while now and have really enjoyed the clarity
> it's brought to my code.  I hadn't realized how bug-prone non-trivial
> loops tend to be until I started writing this way and avoided them
> entirely.
[...]

One of the more influential courses I took in college was on Jackson
Structured Programming. It identified two sources of programming
complexity (i.e., where bugs are most likely to occur): (1) mismatches
between the structure of the program and the structure of the data
(e.g., you're reading an input file that has a preamble, body, and
epilogue, but your code has a single loop over lines in the file); (2)
writing loop invariants (or equivalently, loop conditions).

Most non-trivial loops in imperative code have both, which makes them
doubly prone to bugs. In the example I gave above, the mismatch between
the code structure (a single loop) and the file structure (three
sequential sections) often prompts people to add boolean flags, state
variables, and the like, in order to resolve the conflict between the
two structures. Such ad hoc structure resolutions are a breeding ground
for bugs, and often lead to complicated loop conditions, which invite
even more bugs.
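
To make the problem concrete, here is a minimal sketch of the
single-loop approach. The file format is an assumption made up for
illustration (preamble lines start with '#', the body runs up to a line
"END", everything after that is the epilogue), and Section and
splitSections are hypothetical names:

```d
import std.algorithm : startsWith;
import std.stdio : writeln;

// Hypothetical format (an assumption): '#' lines form the preamble,
// the body runs up to a line "END", the rest is the epilogue.
enum Section { preamble, content, epilogue }

string[][3] splitSections(string[] lines)
{
    string[][3] sections;

    // This state variable exists only to paper over the mismatch
    // between one loop and three sequential sections.
    auto state = Section.preamble;
    foreach (line; lines)
    {
        // The first non-'#' line ends the preamble, but it still has
        // to be handled as a body line further down.
        if (state == Section.preamble && !line.startsWith("#"))
            state = Section.content;

        if (state == Section.content && line == "END")
        {
            state = Section.epilogue;
            continue; // the marker itself belongs to no section
        }
        sections[state] ~= line;
    }
    return sections;
}

void main()
{
    auto s = splitSections(["# title", "a", "b", "END", "bye"]);
    writeln(s[Section.preamble]);
    writeln(s[Section.content]);
    writeln(s[Section.epilogue]);
}
```

Every iteration has to ask "which section am I in?" before it can do
anything, and the order of the two if-statements matters.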

In contrast, if you structure your code according to the structure of
the input (i.e., one loop for processing the preamble, one loop for
processing the body, one loop for processing the epilogue), it becomes
considerably less complex, easier to read (and write!), and far less
bug-prone. Your loop conditions become simpler and thus easier to
reason about, which leaves less room for bugs to hide.

But to be able to process the input in this way requires that you
encapsulate your input so that it can be processed by 3 different loops.
Once you go down that road, you start to arrive at the concept of input
ranges... then you abstract away the three loops into three components,
and behold, component style programming!
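
A rough sketch of the three-loop version, under an assumed format
invented for illustration: preamble lines start with '#', the body runs
up to a line "END", and the rest is the epilogue. Parsed and parse are
hypothetical names. The input range is passed by ref so that each loop
picks up exactly where the previous one stopped:

```d
import std.algorithm : startsWith;
import std.range : empty, front, popFront;
import std.stdio : writeln;

// Hypothetical result type for this sketch.
struct Parsed
{
    string[] preamble, content, epilogue;
}

// R is any input range of strings, taken by ref so that all three
// loops consume the same underlying input.
Parsed parse(R)(ref R lines)
{
    Parsed p;

    // Loop 1: preamble.
    while (!lines.empty && lines.front.startsWith("#"))
    {
        p.preamble ~= lines.front;
        lines.popFront();
    }

    // Loop 2: body.
    while (!lines.empty && lines.front != "END")
    {
        p.content ~= lines.front;
        lines.popFront();
    }
    if (!lines.empty)
        lines.popFront(); // skip the "END" marker itself

    // Loop 3: epilogue.
    for (; !lines.empty; lines.popFront())
        p.epilogue ~= lines.front;

    return p;
}

void main()
{
    auto input = ["# title", "# author", "a", "b", "END", "bye"];
    auto p = parse(input);
    writeln(p.preamble);
    writeln(p.content);
    writeln(p.epilogue);
}
```

Each loop condition only has to describe where its own section ends, so
no flags or state variables are needed.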

In fact, with component style programming, you can also address another
aspect of (1): when you need to simultaneously process two data
structures whose structures don't match. For example, if you want to lay
out a yearly calendar using writeln, the month/day cells must be output
in a radically different order than the logical foreach (m; 1 .. 13) {
foreach (day; 1 .. 32) } structure. Writing this code in the traditional
imperative style produces a mass of spaghetti code: either you have
bizarre loops with convoluted loop conditions for generating the dates
in the order you want to print them, or you have to fill out some kind
of grid structure in a complicated order so that you can generate the
dates in order.

Using ranges, though, this becomes considerably more tractable: you can
have an input range of dates in chronological order, two output ranges
corresponding to chunking by week / month, which feed into a third
output range that buffers the generated cells and prints them once
enough have been generated to fill a row of output. By separating out
these non-corresponding structures into separate components, you greatly
simplify the code within each component and thus reduce the number of
bugs (e.g. it's far easier to ensure you never put more than 7 days in a
week, since the weekly output range is all in one place, as opposed to
sprinkled everywhere across multiple nested loops in the imperative
style calendar code). The code that glues these components together is
also separated out and becomes easier to understand and debug: you
simply read from the input range of dates, write to the two output
ranges, and check if they are full (this check isn't part of the range
API; it is added for this particular example); if the weekly range is full,
start a new week; if the monthly range is full, start a new month. Then
the final output range takes care of when to actually produce output --
you just write stuff to it and don't worry about it in the glue code.
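
A much-reduced sketch of the weekly part only, to give the flavor. The
monthly range and the buffering output range are left out, and
WeekBuffer, full, and layoutWeeks are hypothetical names made up for
this sketch rather than anything from Phobos. Days are plain ints, with
0 standing for a blank cell:

```d
import std.range : iota;
import std.stdio : writeln;

// Hypothetical output range that collects the days of one week.
// `full` is not part of the range API; it is added for this sketch.
struct WeekBuffer
{
    int[] days;

    void put(int day)
    {
        // The 7-day limit is enforced in exactly one place.
        assert(!full, "a week never holds more than 7 days");
        days ~= day;
    }

    bool full() const { return days.length == 7; }
}

// Glue code: read from an input range of day numbers, write to the
// weekly output range, and start a new week whenever it fills up.
// firstWeekday is the column (0 to 6) of the first day of the month.
int[][] layoutWeeks(R)(R days, size_t firstWeekday)
{
    int[][] weeks;
    WeekBuffer week;

    // Blank cells (0) in front of the first day.
    foreach (_; 0 .. firstWeekday)
        week.put(0);

    foreach (day; days)
    {
        week.put(day);
        if (week.full)
        {
            weeks ~= week.days;
            week = WeekBuffer.init; // start a new week
        }
    }
    if (week.days.length > 0)
        weeks ~= week.days; // trailing partial week
    return weeks;
}

void main()
{
    // A 31-day month whose first day falls in column 2.
    foreach (row; layoutWeeks(iota(1, 32), 2))
        writeln(row);
}
```

The glue loop never counts to 7 itself; it only asks the weekly range
whether it is full.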

OK, this isn't really a good example of the linear pipeline style code
we're talking about, but it does show how using ranges as components can
untangle very complicated code into simple, tractable parts that are
readable and easy to debug.


T

-- 
If you compete with slaves, you become a slave. -- Norbert Wiener

