Button: A fast, correct, and elegantly simple build system.

H. S. Teoh via Digitalmars-d-announce digitalmars-d-announce at puremagic.com
Fri Jun 17 13:36:53 PDT 2016


On Fri, Jun 17, 2016 at 07:30:42PM +0000, Fool via Digitalmars-d-announce wrote:
> On Friday, 17 June 2016 at 08:23:50 UTC, Atila Neves wrote:
> > I agree, but CMake/ninja, tup, reggae/ninja, reggae/binary are all
> > correct _and_ fast.
> 
> 'Correct' referring to which standards? There is an interesting series
> of blog posts by Mike Shal:
> 
> http://gittup.org/blog/2014/03/6-clobber-builds-part-1---missing-dependencies/
> http://gittup.org/blog/2014/05/7-clobber-builds-part-2---fixing-missing-dependencies/
> http://gittup.org/blog/2014/06/8-clobber-builds-part-3---other-clobber-causes/
> http://gittup.org/blog/2015/03/13-clobber-builds-part-4---fixing-other-clobber-causes/

To me, "correct" means:

- After invoking the build tool, the workspace *always* reflects a
  valid, reproducible build, regardless of initial conditions: the
  existence or non-existence of intermediate files, stale files,
  temporary files, or other detritus, and independent of environmental
  factors.  Even if a previous build invocation was interrupted in the
  middle, the build system should be able to continue where it left
  off, reproduce any partial build products, and produce exactly the
  same products, bit for bit, as if it had never been interrupted.
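
  For instance, here is a minimal sketch in D of one way to get that
  behaviour (illustrative only, not Button's actual implementation; the
  command template, the ".tmp" suffix, and the example dmd invocation
  are assumptions): write every output to a temporary file and rename
  it into place only once the producing command has succeeded.

    import std.file : exists, remove, rename;
    import std.format : format;
    import std.process : spawnShell, wait;

    void buildAtomically(string cmdTemplate, string target)
    {
        // Run the command against a temporary output name, e.g.
        // buildAtomically("dmd -c foo.d -of=%s", "foo.o").
        auto tmp = target ~ ".tmp";
        auto status = wait(spawnShell(format(cmdTemplate, tmp)));
        if (status != 0)
        {
            if (tmp.exists) tmp.remove();   // discard the partial product
            throw new Exception("command failed for " ~ target);
        }
        rename(tmp, target);   // atomic within one filesystem on POSIX
    }

  Because rename() either publishes the complete new output or leaves
  the old one untouched, an interrupted build never leaves a torn file
  for the next invocation to pick up, and the tool can simply resume
  with whatever steps have not yet been published.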

- If anything changes -- and I mean literally ANYTHING -- that might
  cause the build products to be different in some way, the build tool
  should detect that and update the affected targets accordingly the
  next time it's invoked.  "Anything" includes (but is not limited to):

   - The contents of source files, even if the timestamp stays
     identical to the previous version;

   - Change in compiler flags, or any change to the build script itself;

   - A new version of the compiler was installed on the system;

   - A system library was upgraded / a new library was installed that
     may get picked up at link time;

   - Change in environment variables that might cause some of the build
     commands to work differently (yes, I know this is a bad practice --
     it is not recommended to have your build depend on this, but the
     point is that if it does, the build tool ought to detect it);

   - Editing comments in a source file (what if there's a script that
     parses comments? Or ddoc?);

   - Reverting a patch (that may leave stray source files introduced by
     the patch);

   - Interrupting a build in the middle -- the build system should be
     able to detect any partially-built products and correctly rebuild
     them instead of picking up a potentially corrupted object in the
     next operation in the pipeline.
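
  All of these reduce to one mechanism, sketched below in D as a rough
  illustration of the general technique (not Button's actual code; the
  stamp-file scheme and function names are assumptions): fingerprint
  each build step by hashing the full command line together with the
  contents of every declared input, and rerun the step whenever the
  fingerprint no longer matches the one recorded by the previous run.
  Compiler upgrades, linked libraries, and environment variables are
  caught by treating the compiler binary, the libraries, and the
  relevant variable values as additional inputs to the hash.

    import std.digest : toHexString;
    import std.digest.sha : sha256Of;
    import std.file : exists, read, readText, write;

    // Hash the command line plus the contents of every input file.
    string fingerprint(string command, string[] inputs)
    {
        ubyte[] blob = cast(ubyte[]) command.dup;
        foreach (input; inputs)
            blob ~= cast(ubyte[]) read(input);   // contents, not mtimes
        auto hex = toHexString(sha256Of(blob));
        return hex.idup;
    }

    // A step is up to date only if its recorded fingerprint matches.
    bool isUpToDate(string stamp, string command, string[] inputs)
    {
        return stamp.exists
            && readText(stamp) == fingerprint(command, inputs);
    }

    // Record the fingerprint only after the step has succeeded.
    void recordStamp(string stamp, string command, string[] inputs)
    {
        write(stamp, fingerprint(command, inputs));
    }

  Since the hash covers contents rather than timestamps, touching a
  file without changing it triggers nothing, while an edit that keeps
  the timestamp identical is still detected.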

- As much as is practical, all unnecessary work should be elided. For
  example:

   - If I edit a comment in a source file, and there's an intermediate
     compile stage where an object file is produced, and the object file
     after the change is identical to the one produced by the previous
     compilation, then any further actions -- linking, archiving, etc.
     -- should not be done, because all products will be identical.

   - More generally, if my build consists of source file A, which gets
     compiled to intermediate product B, which in turn is used to
     produce final product C, then if A is modified, the build system
     should regenerate B. But if the new B is identical to the old B,
     then C should *not* be regenerated again.

      - Contrariwise, if modifications are made to B, the build system
        should NOT use the modified B to generate C; instead, it should
        detect that B is out-of-date w.r.t. A, and regenerate B from A
        first, and then proceed to generate C if it would be different
        from before.

   - Touching the timestamp of a source file or intermediate file should
     *not* cause the build system to rebuild that target, if the result
     will actually be bit-for-bit identical with the old product.

   - In spite of this work elision, the build system should still ensure
     that the final build products are 100% reproducible. That is, work
     is elided if and only if it is actually unnecessary; if a comment
     change actually causes something to change (e.g., ddocs are
     different now), then the build system must rebuild all affected
     subsequent targets.
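
  Here is a toy A -> B -> C pipeline in D showing that early-cutoff
  rule (the file names and the "compile"/"link" commands are purely
  hypothetical, and a real tool would track this per target in a
  database rather than inline in main):

    import std.digest : toHexString;
    import std.digest.sha : sha256Of;
    import std.file : exists, read;
    import std.process : spawnShell, wait;

    string contentHash(string path)
    {
        auto hex = toHexString(sha256Of(cast(ubyte[]) read(path)));
        return hex.idup;
    }

    void main()
    {
        // Remember what B looked like before this run.
        immutable oldB = "b.o".exists ? contentHash("b.o") : "";

        wait(spawnShell("compile a.src -o b.o"));   // regenerate B from A

        // Early cutoff: if B came out bit-identical, leave C alone.
        if ("c.out".exists && contentHash("b.o") == oldB)
            return;

        wait(spawnShell("link b.o -o c.out"));      // regenerate C from B
    }

  Combined with the fingerprint check above, B itself is regenerated
  only when A or the command actually changed, so every stage whose
  inputs are provably identical is skipped.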

- Assuming that a revision control system is in place, and a workspace
  is checked out on revision X with no further modifications, then
  invoking the build tool should ALWAYS, without any exceptions, produce
  exactly the same outputs, bit for bit.  I.e., if your workspace
  faithfully represents revision X in the RCS, then invoking the build
  tool will produce the exact same binary products as anybody else who
  checks out revision X, regardless of their initial conditions.

   - E.g., I may be on revision Y, then I run svn update -rX, and there
     may be stray intermediate files strewn around my workspace that
     are not in a fresh checkout of revision X; the build tool should
     still produce exactly the same products as a clean, fresh checkout
     of revision X.  This holds regardless of whether Y represents an
     older revision, a newer revision, or a different branch, etc.

   - In other words, the build system should be 100% reproducible at all
     times, and should not be affected by the existence (or
     non-existence) of any stale intermediate files.
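
  That property is also easy to spot-check (a small D sketch under
  assumed directory layouts, nothing Button-specific): build the same
  revision in two scratch directories and compare every product bit
  for bit.

    import std.file : SpanMode, dirEntries, exists, read;
    import std.path : buildPath;
    import std.stdio : writeln;

    // Compare every file under dirA against its counterpart in dirB.
    // Assumes dirA is passed without a trailing path separator.
    bool sameProducts(string dirA, string dirB)
    {
        foreach (entry; dirEntries(dirA, SpanMode.depth))
        {
            if (!entry.isFile) continue;
            auto rel = entry.name[dirA.length + 1 .. $];
            auto other = buildPath(dirB, rel);
            if (!other.exists
                || cast(ubyte[]) read(entry.name) != cast(ubyte[]) read(other))
            {
                writeln("mismatch: ", entry.name);
                return false;
            }
        }
        return true;
    }

  Any mismatch points at either an undeclared input or stale state
  leaking into the build.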


By the above definition of correctness, Make (and pretty much anything
based on it, that I know of) fails on several counts.  Systems like
SCons come close to full correctness, and I believe tup can also be made
correct in this way.  Make, however, by its very design cannot possibly
meet all of the above requirements simultaneously, and thus fails my
definition of correctness.


T

-- 
A bend in the road is not the end of the road unless you fail to make the turn. -- Brian White

