D Language Foundation April 2023 Monthly Meeting Summary
max haughton
maxhaton at gmail.com
Sun May 14 23:53:41 UTC 2023
On Sunday, 14 May 2023 at 12:47:59 UTC, Mike Parker wrote:
> __CI failures__
>
> Dennis started by saying that the CI was randomly failing
> again. He didn't have a Mac, so he'd been unable to chase down
> the problem. Random CI failures are a recurring problem. There
> are so many checks, and he doesn't know who created them or who
> knows exactly what the checks are doing. He wishes the tests
> had someone responsible for them who he could turn to when they
> fail.
>
> Walter asked who had previously been in charge of the tests.
> Razvan said he didn't recall if one person was ever in charge
> of them. At some point, someone decided it was a good idea to
> have a particular test and it got added to the pipeline.
>
> Dennis asked if we should only keep tests that have a
> maintainer. Martin and Mathias quickly rejected that. Martin
> said the tests are good. CI failures are usually caused by CI
> image bumps or a PR. CI image changes are a PITA for LDC's
> tests, and PR-related failures may not be easy to resolve, but
> failures are hardly ever the fault of a test. And there's never
> been any specific person responsible for any of DMD's CI
> systems. They just grew organically. Then someone who knew the
> details moved on and no one else knows them... it's a constant
> maintenance burden, but it's worth the effort.
>
> There was a bit more discussion about the maintenance burden,
> after which I noted that this is the story of our ecosystem.
> We're responsible now for things none of us set up, and we need
> to get a handle on it all. Dennis agreed and added that the CI
> is in a special position. When one of them is outdated, it
> doesn't just sit there out of the way, it becomes an annoyance
> to development.
>
> (NOTE: This is one of the many aspects of our ecosystem that
> we'll be working to improve [under our new
> workflow](https://forum.dlang.org/thread/avvmlvjmvdniwwxemcqu@forum.dlang.org).)
>
Some thoughts on testing:
1. This (macOS) failure has been fixed (by me). It apparently
also occurred with some other libcs out there before that.
In future these kinds of failures must be prioritized a little
more aggressively. This didn't just mean "Oh well, we'll ignore
that pipeline for a while"; it meant that Phobos effectively
didn't work on macOS (oops).
2. At a bigger scale: we probably have too many CI pipelines. The
main ones I have in mind that really could go are the OMF
pipelines. In OMF we have some ancient baggage which we don't
need and shouldn't want to support anymore: [Microsoft barely
mention OMF
anymore](https://learn.microsoft.com/en-us/search/?terms=OMF&scope=C%2B%2B),
it's not the default from dmd on 32-bit Windows anymore, and
having it in the testsuite ties us to the Digital Mars ecosystem
for likely zero benefit (would you, reader, use Digital Mars if
you were building C code on Windows today?).
3. The testing process could also use some love in terms of
exactly how the tests are set up. Does everything that should/could
use the host compiler use that compiler? Although I think it's
partly his own doing, in not exerting much control over the
compiler codebase other than when others try to organize it,
Walter is right that the test suite should ideally be segmented
into tests ordered by some measure of the number of features they
depend upon.
4. We should have either digger or something like digger being
checked on every PR to make sure it's easy to reproduce. It would
likely be a shell script: shooting from the hip, I think digger is
a very good idea but too complicated, and others and I have all
had it fail in mysterious ways.
5. Automatic bisect? When GitHub issues are done this could be an
interesting use of richer integration with the concept of an
issue to make developers productive. When a bug report is
filed, finding the commit that caused the issue can and should be
done by a bot.
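
To make point 5 a bit more concrete, here is a rough sketch of the search such a bot would run. It is only an illustration, not an existing tool: the function name `first_bad_commit`, the commit labels, and the `fails` callback are all made up for the example. In a real bot the callback would presumably build the compiler at the given commit and run the test case from the bug report. The sketch assumes the commits are ordered oldest to newest, that the newest one shows the bug, and that every commit after the first bad one is bad too (the same assumption `git bisect` makes).

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last commit is bad,
    and every commit after the first bad one is also bad.
    """
    if not commits:
        raise ValueError("need at least one commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid      # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Hypothetical history in which the regression landed in "c4".
    history = ["c1", "c2", "c3", "c4", "c5", "c6"]

    def fails(commit: str) -> bool:
        # Stand-in for "build this commit and run the reporter's test case".
        return history.index(commit) >= history.index("c4")

    print(first_bad_commit(history, fails))
```

The point of the binary search is that the bot needs roughly log2(N) builds for N commits, which is what makes running it automatically on every new report plausible.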