This documentation site is for the versions of Synapse maintained by the Matrix.org Foundation (github.com/matrix-org/synapse), available under the Apache 2.0 licence.
In an ideal world, our git commit history would be a linear progression of
commits each of which contains a single change building on what came
before. Here, by way of an arbitrary example, is the top of git log --graph b2dba0607:
Note how the commit comment explains clearly what is changing and why. Also
note the absence of merge commits, as well as the absence of commits called
things like (to pick a few culprits):
“pep8”, “fix broken
test”,
“oops”,
“typo”, or “Who's
the president?”.
There are a number of reasons why keeping a clean commit history is a good
thing:
From time to time, after a change lands, it turns out to be necessary to
revert it, or to backport it to a release branch. Those operations are
much easier when the change is contained in a single commit.
Similarly, it's much easier to answer questions like “is the fix for
/publicRooms on the release branch?” if that change consists of a single
commit.
Likewise: “what has changed on this branch in the last week?” is much
clearer without merges and “pep8” commits everywhere.
Sometimes we need to figure out where a bug got introduced, or some
behaviour changed. One way of doing that is with git bisect: pick an
arbitrary commit between the known good point and the known bad point, and
see how the code behaves. However, that strategy fails if the commit you
chose is the middle of someone's epic branch in which they broke the world
before putting it back together again.
One counterargument is that it is sometimes useful to see how a PR evolved as
it went through review cycles. This is true, but that information is always
available via the GitHub UI (or via the little-known refs/pull
namespace).
Of course, in reality, things are more complicated than that. We have release
branches as well as develop and master, and we deliberately merge changes
between them. Bugs often slip through and have to be fixed later. That's all
fine: this not a cast-iron rule which must be obeyed, but an ideal to aim
towards.
Ok, so that's what we'd like to achieve. How do we achieve it?
The TL;DR is: when you come to merge a pull request, you probably want to
“squash and merge”:
.
(This applies whether you are merging your own PR, or that of another
contributor.)
“Squash and merge”1 takes all of the changes in the
PR, and bundles them into a single commit. GitHub gives you the opportunity to
edit the commit message before you confirm, and normally you should do so,
because the default will be useless (again: * woops typo is not a useful
thing to keep in the historical record).
The main problem with this approach comes when you have a series of pull
requests which build on top of one another: as soon as you squash-merge the
first PR, you'll end up with a stack of conflicts to resolve in all of the
others. In general, it's best to avoid this situation in the first place by
trying not to have multiple related PRs in flight at the same time. Still,
sometimes that's not possible and doing a regular merge is the lesser evil.
Ultimately: this is not a hard-and-fast-rule. If in doubt, ask yourself “do
each of the commits I am about to merge make sense in their own right”, but
remember that we're just doing our best to balance “keeping the commit history
clean” with other factors.
A lotofwords have been
written in the past about git branching models (no really, a
lot). I tend to
think the whole thing is overblown. Fundamentally, it's not that
complicated. Here's how we do it.
Let's start with a picture:
It looks complicated, but it's really not. There's one basic rule: anyone is
free to merge from any more-stable branch to any less-stable branch at
any time2. (The principle behind this is that if a
change is good enough for the more-stable branch, then it's also good enough go
put in a less-stable branch.)
Meanwhile, merging (or squashing, as per the above) from a less-stable to a
more-stable branch is a deliberate action in which you want to publish a change
or a set of changes to (some subset of) the world: for example, this happens
when a PR is landed, or as part of our release process.
So, what counts as a more- or less-stable branch? A little reflection will show
that our active branches are ordered thus, from more-stable to less-stable:
master (tracks our last release).
release-vX.Y (the branch where we prepare the next release)3.
The corollary is: if you have a bugfix that needs to land in both
release-vX.Yanddevelop, then you should base your PR on
release-vX.Y, get it merged there, and then merge from release-vX.Y to
develop. (If a fix lands in develop and we later need it in a
release-branch, we can of course cherry-pick it, but landing it in the release
branch first helps reduce the chance of annoying conflicts.)
[1]: “Squash and merge” is GitHub's term for this
operation. Given that there is no merge involved, I'm not convinced it's the
most intuitive name. ^
[3]: Very, very occasionally (I think this has happened once in
the history of Synapse), we've had two releases in flight at once. Obviously,
release-v1.2 is more-stable than release-v1.3. ^