A brief survey of build tools, focused on D

Wed Dec 12 22:41:50 UTC 2018

On Wed, Dec 12, 2018 at 02:52:09PM -0700, Jonathan M Davis via Digitalmars-d-announce wrote:
[...]
> I would think that to be fully flexible, dub would need to abstract
> things a bit more, maybe effectively using a plugin system for builds
> so that it's possible to have a dub project that uses dub for pulling
> in dependencies but which can use whatever build system works best for
> your project (with the current dub build system being the default).
> But of course, even if that is made to work well, it then introduces
> the problem of random dub projects then needing 3rd party build
> systems that you may or may not have (which is one of the things that
> dub's current build system mostly avoids).

And here is the crux of my rant about build systems (earlier in this
thread).  There is no *technical reason* why build systems should be
constricted in this way. Today's landscape of specific projects being
inextricably tied to a specific build system is completely the wrong
approach.

Projects should not be tied to a specific build system.  Instead,
whatever build tool the author uses to build the project should export a
universal description of how to build it, in a standard format that can
be imported by any other build system. This description should be a
fully general DAG that specifies all inputs, all outputs (including
intermediate ones), and the actions required to get from input to
output.
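
To make that concrete, here is a minimal sketch of what such a
description might contain, written as plain Python data.  The field
names ("inputs", "outputs", "action") and the dmd command lines are
assumptions made up for illustration; no such standard format exists
today.

```python
# Hypothetical build description for a two-module D program.
# Each step lists its inputs, its outputs, and the command (argv)
# that turns the former into the latter.
STEPS = [
    {"inputs": ["main.d"], "outputs": ["main.o"],
     "action": ["dmd", "-c", "main.d"]},
    {"inputs": ["util.d"], "outputs": ["util.o"],
     "action": ["dmd", "-c", "util.d"]},
    {"inputs": ["main.o", "util.o"], "outputs": ["app"],
     "action": ["dmd", "-ofapp", "main.o", "util.o"]},
]

def producers(steps):
    """Map each output file to the step that produces it."""
    table = {}
    for step in steps:
        for out in step["outputs"]:
            if out in table:
                raise ValueError(f"{out} is produced by more than one step")
            table[out] = step
    return table

def source_files(steps):
    """Inputs that no step produces, i.e. the leaves of the DAG."""
    made = producers(steps)
    return sorted({i for s in steps for i in s["inputs"] if i not in made})

print(source_files(STEPS))  # the files a human actually wrote
```

Intermediate files such as main.o appear both as an output of one step
and as an input of another, which is what makes the whole thing a graph
rather than a flat list of commands.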

Armed with this build description, any build system should be able to
import as a dependency any project built with any other build system,
and be able to successfully build said dependency without even knowing
what build system was originally used to build it or what build system
it is "intended" to be built with.  I should be able to import a Gradle
project, a dub project, and an SCons project as dependencies, and be
able to use make to build everything. And my downstream users ought to
be able to build my project with tup, or any other build tool they
choose, without needing to care that I used make to build my project.

Seriously, building a lousy software project is essentially traversing a
DAG of inputs and actions in topological order.  The algorithms have
been known for decades, and there is absolutely no valid reason why we
cannot import arbitrary sub-DAGs, glue them to the main DAG, and have
everything work with no additional effort, regardless
of where said sub-DAGs came from.  It's just a bunch of nodes and
labelled edges, guys!  All the rest of the complications and build
system dependencies and walled gardens are extraneous and completely
unnecessary baggage imposed upon a straightforward DAG topological walk
that any CS grad could write in less than a day.  It's ridiculous.
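
For reference, the walk in question is roughly the following.  This is
a sketch of Kahn's algorithm, one common way to do a topological sort,
and not the code of any particular build tool; the `deps` layout (node
mapped to the nodes it depends on) is my own assumption.

```python
from collections import deque

def topo_walk(deps):
    """Return nodes so that each one comes after all its dependencies.

    deps maps a node to the nodes it depends on.
    """
    nodes = set(deps)
    for preds in deps.values():
        nodes.update(preds)

    # Count unmet dependencies and record who is waiting on whom.
    remaining = {n: len(set(deps.get(n, ()))) for n in nodes}
    dependents = {n: [] for n in nodes}
    for node, preds in deps.items():
        for p in set(preds):
            dependents[p].append(node)

    # Start with the nodes that depend on nothing (the source files).
    ready = deque(sorted(n for n, c in remaining.items() if c == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for d in sorted(dependents[node]):
            remaining[d] -= 1
            if remaining[d] == 0:
                ready.append(d)

    if len(order) != len(nodes):
        raise ValueError("cycle detected: not a DAG")
    return order

deps = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.d"],
    "util.o": ["util.d"],
}
print(topo_walk(deps))
# ['main.d', 'util.d', 'main.o', 'util.o', 'app']
```

Nothing in the walk cares where the nodes came from, so a sub-DAG
imported from another project is just more entries in `deps`.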

> On some level, dub is able to do as well as it does precisely because
> it's able to assume a bunch of stuff about D projects which is true
> the vast majority of the time, and the more it allows projects that
> don't work that way, the worse dub is going to work as a general tool,
> because it increasingly opens up problems with regards to whether you
> have the right tools or environment to build a particular project when
> using it as a dependency. However, if we don't figure out how to make
> it more flexible, then certain classes of projects really aren't going
> to work well with dub.  That's less of a problem if the project is not
> for a library (and thus does not need to be a dub package so that
> other packages can pull it in as a dependency) and if dub provides a
> good way to just make libraries available as dependencies rather than
> requiring that the ultimate target be built with dub, but even then, it
> doesn't solve the problem when the target _is_ a library (e.g. what if
> it were for wrapping a C or C++ library and needed to do a bunch of
> extra code steps for code generation and needed multiple build steps).

Well exactly, again, the monolithic approach to building software is the
wrong approach, and leads to arbitrary and needless limitations of this
sort.  DAG generation should be decoupled from build execution.  You can
use whatever tool or fancy algorithm you want to generate the lousy DAG,
but once generated, all you have to do is to export it in a standard
format, then any arbitrary number of build executors can read the
description and run it.
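
A sketch of that hand-off, assuming JSON as the interchange format.
The "build-dag/0" tag and the field names are invented for this
example; the only point is that the generator and the executor share
nothing except the file.

```python
import json

FORMAT = "build-dag/0"  # hypothetical format tag, not an existing standard
REQUIRED = {"inputs", "outputs", "action"}

def dump_dag(steps):
    """Generator side: serialize the steps to a JSON document."""
    return json.dumps({"format": FORMAT, "steps": steps}, indent=2)

def load_dag(text):
    """Executor side: parse a description and sanity-check it."""
    doc = json.loads(text)
    if doc.get("format") != FORMAT:
        raise ValueError(f"unsupported format: {doc.get('format')!r}")
    for i, step in enumerate(doc["steps"]):
        missing = REQUIRED - set(step)
        if missing:
            raise ValueError(f"step {i} lacks {sorted(missing)}")
    return doc["steps"]

steps = [
    {"inputs": ["main.d"], "outputs": ["app"],
     "action": ["dmd", "-ofapp", "main.d"]},
]
text = dump_dag(steps)          # what the generator writes out
assert load_dag(text) == steps  # what any executor reads back
```

The executor never learns which tool produced `text`, which is exactly
the decoupling being argued for.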

Again I say: projects should not be bound to this or that build system.
Instead, they should export a universal build description in a standard
format.  Whoever wants to depend on said projects can simply import the
build description and it will Just Work(tm). The build executor will
know exactly how to build the dependency independently of whatever fancy
tooling the upstream author may have used to generate the DAG.

> So, I don't know. Ultimately, what this seems to come down to is that
> all of the stuff that dub does to make things simple for the common
> case makes it terrible for complex cases, but making it work well for
> complex cases would almost certainly make it _far_ worse for the
> common case. So, I don't know that we really want to be drastically
> changing how dub works, but I do think that we need to make it so that
> more is possible with it (even if it's more painful, because it's
> doing something that goes against the typical use case).
[...]

Dub's very design as a monolithic build tool, like many other build
tools out there, confines it to such needless limitations. Developing it
further in this direction is IMO a waste of time.

It's time we came back to the essentials.  Current monolithic build
systems ought to be split into two parts:

(1) Dependency detector / DAG generator.  Do whatever you need to do
here: dub-style scanning of .d imports, scanning directories for .d
files, tup-style instrumenting of the compiler, typing it out yourself,
whatever.
The resulting DAG is stored in a standard format in a standard location
in the source tree.

(2) Build executor: read in a standard DAG and employ a standard
topological walk to transform inputs into outputs.
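
As a rough illustration of part (1), here is a toy generator that scans
D source text for import statements and emits edges.  It is a
deliberately naive sketch: the regex only sees the first module of each
import declaration, it assumes flat module names that map to
<module>.d, and it is not how dub actually resolves imports.

```python
import re

# Matches the first module named in an import declaration, e.g.
# "import std.stdio;" or "public import util : helper;".
IMPORT_RE = re.compile(
    r"^\s*(?:(?:public|private|static)\s+)*import\s+([\w.]+)",
    re.MULTILINE,
)

def scan_imports(source):
    """Return the sorted, de-duplicated module names a source imports."""
    return sorted(set(IMPORT_RE.findall(source)))

def generate_dag(sources):
    """sources maps module name -> source text.

    Returns a map of target -> sorted list of inputs.  Imports that are
    not part of the project (e.g. std.stdio) are ignored.
    """
    dag = {}
    for module, text in sources.items():
        local = [m for m in scan_imports(text)
                 if m in sources and m != module]
        dag[module + ".o"] = sorted([module + ".d"] +
                                    [m + ".d" for m in local])
    dag["app"] = sorted(m + ".o" for m in sources)
    return dag

sources = {
    "main": "import std.stdio;\nimport util;\nvoid main() {}\n",
    "util": "module util;\nint f() { return 1; }\n",
}
dag = generate_dag(sources)
print(dag["main.o"])  # ['main.d', 'util.d']
```

Any other generator (a directory scan, a traced compiler run, a
hand-written file) would be equally valid, as long as it emits the same
kind of graph for part (2) to consume.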

Every project should publish the DAG in a standard format in a standard
location. Then whenever you need that project as a dependency, you just
import its DAG into yours, and build away. Problem solved.
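
One of the details an import function has to handle is that the
dependency's paths are relative to its own tree, not yours.  A minimal
sketch, again assuming the hypothetical step layout of inputs, outputs
and action, plus an invented "cwd" field telling the executor where to
run the command:

```python
import posixpath

def rebase(steps, prefix):
    """Rewrite a dependency's steps so paths are relative to our tree."""
    def at(path):
        return posixpath.normpath(posixpath.join(prefix, path))
    return [
        {
            "inputs": [at(p) for p in s["inputs"]],
            "outputs": [at(p) for p in s["outputs"]],
            "action": s["action"],        # unchanged, runs inside cwd
            "cwd": at(s.get("cwd", ".")),
        }
        for s in steps
    ]

def import_dependency(own, imported, prefix):
    """Glue a dependency's steps onto ours, rejecting clashing outputs."""
    merged = list(own) + rebase(imported, prefix)
    seen = set()
    for step in merged:
        for out in step["outputs"]:
            if out in seen:
                raise ValueError(f"two steps produce {out}")
            seen.add(out)
    return merged

own = [
    {"inputs": ["main.d", "deps/mylib/libmylib.a"], "outputs": ["app"],
     "action": ["dmd", "-ofapp", "main.d", "deps/mylib/libmylib.a"]},
]
imported = [
    {"inputs": ["src/lib.d"], "outputs": ["libmylib.a"],
     "action": ["dmd", "-lib", "-oflibmylib.a", "src/lib.d"]},
]
merged = import_dependency(own, imported, "deps/mylib")
print(merged[1]["outputs"])  # ['deps/mylib/libmylib.a']
```

After the merge, the library's output and the application's input name
the same file, so the two graphs are connected by an ordinary edge.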

Now of course, in real-life implementation, there will be many more
details that need to be taken care of.  But these are the essentials:
standard DAG representation, and a standard DAG import function. Once
you have these two, there are no longer silly arbitrary limitations that
serve no other purpose than to build walled gardens and annoy users.
Everyone can use their build tool of choice, and it all Just Works(tm).
You can have any project depend on any other project, and nobody has to
worry about installing the 100th variation on `make` just to make the
dumb thing compile.

Topological walk on a DAG is a solved problem, and there is no logical
reason why it should be so danged complicated.

T

-- 
Right now I'm having amnesia and deja vu at the same time. I think I've forgotten this before.