Multi-commit PRs vs. multiple single-commit PRs

Vladimir Panteleev via Digitalmars-d digitalmars-d at puremagic.com
Tue Mar 21 05:49:22 PDT 2017


On Tuesday, 21 March 2017 at 11:59:42 UTC, deadalnix wrote:
> Then it should have been 2 PR or more to begin with. Splitting 
> PR in smaller ones is a good practice in general,

This is probably true for many cases, but I don't think it's a 
general truth.

First, there are extreme cases like these:

https://github.com/dlang/druntime/pull/1402
https://github.com/dlang/phobos/pull/260

I think we can agree that it would be better to have 1 pull 
request with 70 commits than 70 pull requests with 1 commit.

Second, there are many cases that fall in the middle: some 
ancillary change (such as a minor refactoring, or a build file 
change) is needed by a bigger change, but it is too minor to be a 
PR of its own. Putting both in one commit is also unwarranted.

I think that the philosophy to prefer squashing is not suited for 
projects such as D, where we care about history. Pushing for one 
change per PR also pushes people to put too many things in one 
commit, and write less descriptive commit messages.

Finally, some of the biggest open source projects merge pull 
requests consisting of multiple commits, and encourage submitters 
to divide their changes into as many commits as is logical, which 
seems to be the workflow they consider optimal.

> there are ample proof that is increase the quality of the code 
> review,

OK, where is the proof?

It is worth pointing out that GitHub's UI is heavily biased 
towards reviewing PRs in entirety, however it does allow 
reviewing PRs commit by commit, which is how I think non-trivial 
submissions and reviews should occur anyway.

> reduce conflicts surface with other PR, makes reverting easier 
> and more targeted when something happens, etc...

Sure, I'm not advocating that all submissions happen as one PR. 
The way I understand it, it is you who is advocating the extreme 
position that all PRs should always contain a single commit.

> Keeping this PR's commits is just a way to mitigate one of the 
> negative consequences of kitchen sink PRs. It does so by 
> impacting negatively others aspects of source control, and does 
> nothing to mitigate other negatives aspects of kitchen sink PRs,

Frankly I don't think this makes any sense at all.

> such as review fatigue (see a specific example below).

The other side of the coin is submitter fatigue.

I've seen this happen:

1. Submitter submits a PR, containing two commits: a change, and 
a refactoring required for the change.
2. Reviewers ask the submitter to split it into two PRs.
3. Submitter resubmits the refactoring as a separate PR.
4. The refactoring PR sits in the review queue forever because at 
best, it does nothing, at worst it introduces a regression. 
Reviewers who did not see or bother to read the first PR ask what 
this refactoring is for and why it's needed.
5. Submitter is fed up and leaves.

> Consider this PR: 
> https://github.com/BitcoinUnlimited/BitcoinUnlimited/pull/164
>
> You can see in the comments that I asked the original author to 
> split it up because it was a kitchen sink and very hard to 
> review in its current form. This was ignored. The PR ended up 
> containing a bug that would cost about $12 500 to one of the 
> users of the software, plus a fair amount of reputational 
> damage. The change containing the bug did not need to be 
> bundled with the rest of the PR, and would have almost 
> certainly be noticed if it had been made on a PR of its own.
>
> Bundling several changes in the same PR has real world 
> consequences that go beyond screwing up source control.

I don't think that's a fair example at all.

What exactly prevents reviewing a PR consisting of multiple 
commits differently from multiple PRs consisting of one commit?

In both cases, you can:
- look at each change individually
- add review comments on the change, either on the change in 
entirety or on individual lines
- selectively pick and merge a subset of the submitted changes 
(though, GitHub makes this more difficult for multi-commit PRs).

I can only guess that by "difficult to review" you mean from only 
looking at the "diff" tab; however, I think it's disingenuous to 
say that if you are not using the tools properly.

Anyway, to reiterate, this is a distinct argument from which 
merge strategy to use. However, generally, I think this approach 
is more bad than good because it pushes towards destroying 
information (commit messages and separation) and clumping too 
many minor changes into single commits.



More information about the Digitalmars-d mailing list