D stability testing framework

Thu May 23 07:30:16 PDT 2013

Hello all,

Having listened to Andrei and Walter's Q&A and read some of the discussion
surrounding it, an idea occurred to me.

How about leveraging the selection of 3rd-party D code out there to provide a
testing framework for D's stability as a language?

The framework would pull in and build a specified version of dmd, druntime and
Phobos, and then use those to pull in and build a selection of 3rd-party
libraries and run whatever test code those libraries provide.  It would then
report the number of build failures and runtime/unittest failures, and try to
classify them according to whether they are due to a change in the _language_,
a change in the runtime, or a change in Phobos.
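
One way the classification step might work is to start from a known-good set of
versions and upgrade one component at a time, noting which single upgrade breaks
a project.  The sketch below is only an illustration under that assumption: the
function name, the version dictionaries and the `passes` callback (which stands
in for "build the toolchain, then build and test the project") are all
hypothetical, and in practice mixed versions of dmd, druntime and Phobos may not
build together at all.

```python
from typing import Callable, Dict, List

# The three components the post proposes to distinguish between.
COMPONENTS = ("dmd", "druntime", "phobos")

Versions = Dict[str, str]


def attribute_breakage(
    passes: Callable[[Versions], bool],
    baseline: Versions,
    candidate: Versions,
) -> List[str]:
    """Guess which component upgrade breaks a project.

    `passes` is a hypothetical callback that returns True when the project
    builds and its tests succeed against the given component versions.
    """
    if not passes(baseline):
        # The project was already failing, so the change is not to blame.
        return ["pre-existing"]
    if passes(candidate):
        # Nothing broke.
        return []
    culprits = []
    for name in COMPONENTS:
        # Upgrade exactly one component and keep the others at baseline.
        trial = dict(baseline)
        trial[name] = candidate[name]
        if not passes(trial):
            culprits.append(name)
    # If no single upgrade breaks the project, only a combination does.
    return culprits or ["interaction"]
```

For example, with a stand-in check that fails whenever the new Phobos is used,
`attribute_breakage` reports `["phobos"]`; a failure that only appears when
several components are upgraded together is reported as `["interaction"]`.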

The idea would be to get reliable statistics on what breaking changes are
causing what degree of pain (and why) for D users.  Currently we have arguments
over how stable the language is and what kind of breakages are or aren't
acceptable.  With a good enough selection of 3rd-party code, it might be
possible to quantify the prospective impact of a change.

Ideally the test code could be provided simply by providing a list of git
repositories and branches to test.  A practical stumbling block might be
different build systems etc., so it might be necessary to have some kind of
standardized build system the testing framework could expect (e.g. a testbuild.d
script, just as some libraries currently provide a build.d script).
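
As a rough illustration of the "list of git repositories and branches" idea, the
sketch below parses a plain-text manifest and works out the commands a harness
might run for each project.  The manifest format (one "url branch" pair per
line, `#` comment lines, default branch `master`), the function names and the
example URLs are assumptions made up for this sketch; `testbuild.d` is the
hypothetical standardized script mentioned above, assumed here to be runnable
with rdmd.

```python
from typing import List, NamedTuple


class Project(NamedTuple):
    url: str
    branch: str


def parse_manifest(text: str) -> List[Project]:
    """Parse lines of the form '<git url> [branch]' (assumed format)."""
    projects = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # Fall back to an assumed default branch when none is given.
        branch = parts[1] if len(parts) > 1 else "master"
        projects.append(Project(parts[0], branch))
    return projects


def commands_for(project: Project, workdir: str) -> List[List[str]]:
    """Return the commands a harness might run; nothing is executed here."""
    return [
        # Shallow clone of just the requested branch.
        ["git", "clone", "--depth", "1", "--branch", project.branch,
         project.url, workdir],
        # Run the project's (hypothetical) standardized test script,
        # to be executed from inside workdir.
        ["rdmd", "testbuild.d"],
    ]


manifest = """
# hypothetical example entries
https://example.org/alice/libfoo.git stable
https://example.org/bob/libbar.git
"""
projects = parse_manifest(manifest)
```

A harness would then run each command list in turn (for instance with
`subprocess.run`) and record whether the clone, the build or the tests failed.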

The goal would be to provide mutual benefit for the D language and providers of
public test code.  Patches to D, druntime or Phobos could be tested to see the
extent of breakage they cause, while providers of test code could get automated
early warning that a change to the D frontend, druntime or Phobos is going to
impact their project (and a framework to test patches designed to cope with
those breaking changes).  It should also be possible to run the testing
framework on one's personal machine, so that, for example, downstream users
could test their code against latest-off-GitHub versions of dmd, druntime and
Phobos without having to make their code publicly available.

I don't know if this idea is practical in reality -- it might be difficult to
distinguish between breakages caused by changes to D and breakages caused by
other problems such as incorrect testbuild scripts.  It's also cheeky of me to
propose it, as I definitely don't have the know-how or time to do this myself.
But I thought I'd
throw it out there to see if it's an idea worth pursuing by someone.

Best wishes,

    -- Joe