CI buildbots

Wed May 23 20:51:59 UTC 2018

On 23 May 2018 at 07:09, Seb via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On Monday, 21 May 2018 at 04:46:15 UTC, Manu wrote:
>>
>> This CI situation with the DMD/druntime repos is not okay.
>> It takes ages... **hours** sometimes, for CI to complete.
>> It's all this 'auto-tester' one, which seems to lock up on the last few
>> tests.
>>
>> This makes DMD is a rather unenjoyable project to contribute to.
>> I had a sudden burst of inspiration, but it's very rapidly wearing off.
>
>
> As mentioned on GitHub, running the compile+fail testsuite locally takes 10
> seconds.
> Typically a PR doesn't touch much cross-platform stuff and if it does, you
> should get a negative reply pretty early from any CI.
> If you use CIs to detect merge conflicts, they will also occur locally.
>
> As explained on GitHub auto-tester gives priority to PRs with the auto-merge
> label and will constantly invalidate old builds whenever something got
> merged, so it typically never completes for a PR.
> OTOH if your PR has been approved, it will have priority access and normally
> it will be merged quite quickly.
>
> That being said, if you have ideas on how to improve the ecosystem, please
> let us/me know (except for adding new machines to the auto-tester - that's
> something that seems to be out of our reach).

I'm not suggesting adding new machines... I'm suggesting removing the
ones that take ~50 minutes.
I think they're a net loss for the pipeline. A <10 minute machine
could finish 4 other jobs AND finish mine in the same amount of
time. In practice it's likely to get to mine a lot sooner.
A single machine in the pipeline that takes 50 minutes makes the
pipeline take at least 50 minutes.
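
To make the arithmetic concrete, here is a rough sketch, assuming a PR only
goes green once every machine has reported. The four slow machine names and
the ~50 / <10 minute figures are the ones from this thread; `fast-a` and
`fast-b` are hypothetical stand-ins for the quicker machines, and none of
this is the auto-tester's real code.

```python
# Minutes each machine needs for one PR build (illustrative round numbers).
# "fast-a" and "fast-b" are hypothetical names for the quicker machines.
build_minutes = {
    "fast-a": 10,
    "fast-b": 10,
    "win-farm-1": 50,
    "win-farm-2": 50,
    "bellevue": 50,
    "inglebrook": 50,
}

SLOW = {"win-farm-1", "win-farm-2", "bellevue", "inglebrook"}


def pipeline_latency(minutes_by_machine):
    """A PR is green only when every machine has reported, so the
    wall-clock latency is the slowest machine's build time."""
    return max(minutes_by_machine.values())


fast_only = {name: m for name, m in build_minutes.items() if name not in SLOW}

latency_with_slow = pipeline_latency(build_minutes)
latency_without_slow = pipeline_latency(fast_only)
# How many builds a fast machine completes while a slow one does a single build.
builds_per_slow_build = latency_with_slow // latency_without_slow

print(latency_with_slow, latency_without_slow, builds_per_slow_build)
# prints: 50 10 5
```

With these assumed numbers, one fast machine gets through five builds (four
other jobs plus mine) in the time a slow machine needs for one.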

Latency matters more than throughput here; the pipeline would be better
without those 4 machines. (win-farm-1, win-farm-2, bellevue, inglebrook)

I guess the flow I observed on the weekend, where it took me 3 days to
get my batch of PRs merged, is that people only reviewed a PR once it
was already green... so that's at least 50 minutes at the end of my
actual work. THEN they ask me to add `package` or `const` somewhere, so
I do, and that's another hour before they have another look to confirm
the tweak. By then they've probably got bored of reviewing PRs after
the last 2-3 hours, and gone to bed, or to work, or to get lunch or
something, so they reappear the next day.
So in practice, adding 2-3 hours to the cycle adds 24 hours to the
cycle. The latency was the problem for all of my 5-6 patches, not the
throughput.

Maybe those 4 machines could be given a rule where they're only
assigned jobs to re-test PRs against an updated DMD or whatever, and
only if the PR has been silent for >24 hours or something... so they
are assigned to low-frequency jobs, and never assigned to fresh jobs?
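
A minimal sketch of what such a rule could look like, assuming the scheduler
knows how long each PR has been idle. Everything here is hypothetical
(`machines_for`, `IDLE_THRESHOLD_HOURS`, and the `fast-a` / `fast-b` names);
it is not how the auto-tester actually assigns jobs.

```python
# The four slow machines named in this thread.
SLOW_MACHINES = {"win-farm-1", "win-farm-2", "bellevue", "inglebrook"}

# Hypothetical pool: the slow machines plus two made-up fast ones.
ALL_MACHINES = SLOW_MACHINES | {"fast-a", "fast-b"}

# Assumed cut-off: a PR counts as "silent" after more than 24 idle hours.
IDLE_THRESHOLD_HOURS = 24


def machines_for(pr_idle_hours, all_machines=ALL_MACHINES):
    """Return the machines allowed to build a PR.

    Fresh PRs go to fast machines only; PRs that have been silent for
    longer than the threshold may also be re-tested on the slow ones.
    """
    if pr_idle_hours > IDLE_THRESHOLD_HOURS:
        return sorted(all_machines)
    return sorted(m for m in all_machines if m not in SLOW_MACHINES)


fresh = machines_for(pr_idle_hours=1)    # just pushed: fast machines only
stale = machines_for(pr_idle_hours=30)   # silent for over a day: all machines
```

Under this rule a fresh push never waits on a 50-minute machine, while the
slow machines still re-test PRs nobody is actively waiting on.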