What is a merge queue, and does your team need one? (graphite.dev)
69 points by dbalatero on July 18, 2023 | hide | past | favorite | 52 comments


I think if you have that many developers, you need to start doing it Linux-kernel style and have maintainers responsible for certain parts of the project, making sure merges go in fine.

And maybe have actual communication by means other than comments on tickets. The whole hypothetical example could be avoided by typing "hey, I'm changing this field to be that, anything against?" in the team's chat.

> Since my_branch was merged before it was rebased onto someone_elses_branch, this error goes unnoticed and BAM - main is broken! This is known as a semantic merge conflict — a merge is technically possible, but results in a regression.

Do people not merge master into their branch before pushing? All of that seems like mostly a problem caused by having too-long-lived branches.

I'm also confused how that even fixes anything; someone still has to go back and fix that code.


> Do people not merge master into their branch before pushing? All of that seems like mostly a problem caused by having too-long-lived branches.

On projects with a high commit velocity and a large team, simply merging main and running the tests is not enough. The likelihood that a new commit lands on main before your tests finish is high, and restarting the process every time there's a new commit is time consuming. Merge queues are the only way to ensure that main does not break from conflicts that arise "in between" commits.


> Do people not merge master into their branch before pushing?

No, they rebase! (Sorry, couldn't resist.) Yeah, the other half does (warning: opinion!) ugly criss-cross merges that make the commit graph look like the cables under my desk...

> Merge queues are the only way to ensure that main does not break from conflicts that arise "in between" commits.

Not the only one (though you may argue this is a degenerate queue): use merge locks! Yeah, we really use git SVN-style; the locking is done via a chat channel. No, please don't laugh. Acquire the lock, merge develop or rebase, check everything again, release the lock :D :D (Again: I am not suggesting this to anyone!! :D )


The point the GP was making is that if merging to main is frequent (bigger team) and build times are long (bigger projects), catching your branch up to master before merging can be hard, as master is moving ahead faster than you can rebase and build.


Fully understood... we also have a big team, a too-slow CI, and far too many LOCs. The "merge lock" ensures you catch up; you keep the lock as long as it takes (or until others get too annoyed and urge you to step back). As long as you have it, nothing comes between you and main from the moment you claimed it.


That sounds like an awful dev loop.


Right, but the biggest projects out there, like the Linux kernel, manage it just fine.

> The likelihood that a new commit lands on main before your tests finish is high, and restarting the process every time there's a new commit is time consuming. Merge queues are the only way to ensure that main does not break from conflicts that arise "in between" commits.

But the code is still broken and someone still has to fix it? That doesn't help with that.

As for "broken main", how is simply not allowing code that doesn't pass tests to merge to main not enough?

It all seems like a lot of complexity added because devs don't want to change shitty work practices...



> Do people not merge master into their branch before pushing? All of that seems like mostly a problem caused by having too-long-lived branches.

Well, yes, but you have to run your tests.

The article outlines this scenario, but to give a more concrete definition, imagine your tests take 10 minutes, and this timeline happens:

10:00 Co-worker rebases their branch with master and begins the unit testing.

10:05 You rebase your branch with master and begin unit testing.

10:10 Co-worker's tests finish and everything looks green, pushes to master.

10:15 Your tests are done and everything is green. You try to push, but your branch is now out of date because master contains your co-worker's changes! You rebase on master, restart your tests...

10:25 Your tests finish, it's green, and you finally push to master.

Ten minutes got wasted. It doesn't seem like a lot, but with enough engineers (or long enough test times), this scenario can happen a lot, especially with very short-lived branches.


And how does a merge queue not waste that time?

Someone will still have to take that step: update their code, fix it, then test.


With the merge queue, it works like this:

10:00 Co-worker pushes their code to the merge queue. CI/CD pipeline begins running unit tests.

10:05 You push your code to the queue.

10:10 CI/CD finishes the tests. It looks green. Their code gets merged to master. CI/CD starts running your unit tests with the new master branch and your code applied.

10:20 CI/CD finishes the tests with your code, tests are green, and merges to master.

Five minutes were saved by using a merge queue.
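The timeline above can be sketched as code. This is a toy model, not any real product's API: a single worker drains the queue, and every candidate is tested against the current tip of main plus whatever landed ahead of it, so nothing can merge against a stale base.

```python
import queue
import threading
import time

CI_SECONDS = 0.01        # stand-in for the 10-minute test run above
main = ["base"]          # the protected branch
merge_q = queue.Queue()  # PRs enter here in submission order

def ci_passes(candidate):
    time.sleep(CI_SECONDS)  # pretend to run the full test suite
    return True             # toy model: every suite is green

def queue_worker():
    """Single consumer: each PR is tested against main as it will
    actually look at merge time, then merged serially."""
    while True:
        pr = merge_q.get()
        if pr is None:
            break
        candidate = main + [pr]   # "master + this one feature"
        if ci_passes(candidate):
            main.append(pr)       # green: lands; a red PR would be bounced back
        merge_q.task_done()

worker = threading.Thread(target=queue_worker)
worker.start()
merge_q.put("coworker-pr")  # 10:00 - co-worker submits and walks away
merge_q.put("your-pr")      # 10:05 - you submit; no rebase-and-rerun loop
merge_q.join()
merge_q.put(None)
worker.join()
print(main)  # ['base', 'coworker-pr', 'your-pr']
```

The saving isn't only wall-clock time: neither author has to babysit CI and race to push first.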


> The whole hypothetical example could be avoided by typing "hey, I'm changing this field to be that, anything against?" in the team's chat

This doesn't scale. With 15 developers, you could be working on ten different features at the same time; getting alignment on every field change (not even API changes) among that many people and projects just isn't feasible.

And 15 is still considered a small team.

> Do people not merge master into their branch before pushing? All of that seems like mostly a problem caused by having too-long-lived branches.

Again, this does not scale. Every time you merge master into your branch, you have to wait for CI to pass before you can merge. If in the meantime someone else merges their change before you - and this happens asynchronously, whenever it's ready / whenever someone gets to it - you have to merge master again and wait for CI again. That takes your focus off whatever else you want to work on next; if you're waiting for CI and hoping nobody else merges before you, you're epitomizing the "Compiling!" comic.

Your idea works, but only for smaller teams, faster CI pipelines, and lower frequency of merge attempts.

Merge queues, on the other hand, are fire and forget; they enforce a linear process - every feature is effectively "master + this one feature", so you don't get two merges interfering with each other - and they allow the owner to move on to the next thing. Unless CI breaks, which is actionable. It allows for an asynchronous process with less gatekeeping, coordination, and waiting for things to happen. It scales better.


When I started my career working for a large corp, we had dedicated people for this called Release Engineers. The concept of specialization seems to have fallen out of favor within engineering orgs…


I'm convinced that all repositories with more than a few hundred PRs definitely need a merge queue. Glad to see discussion in this space, I've been frustrated by the lack of merge queues at previous companies.


Agreed. It's a common problem that many teams don't even realise they have.

Same thing with stacked PRs. It's a wonder GitHub hasn't shipped merge queues + stacked PRs by now.


Graphite.dev is stacked PRs for GitHub.


If a merge queue is light and fast, all repos could benefit from having it enabled by default


Large teams need things beyond a merge queue, they need a merge graph: https://trunk.io/blog?post=trunk-merge


You have everything well explained here : https://blog.mergify.com/whats-a-merge-queue-and-why-use-it/


The first description I read of a merge queue came from Graydon Hoare, the creator of the Rust programming language, describing a CI system his team cobbled together in 2001 [0]. He called the principle behind it "The Not Rocket Science Rule Of Software Engineering".

Zuul, of course, had this back in 2012.[1]

2023: GitHub catching up!

[0]: <https://graydon2.dreamwidth.org/1597.html> [1]: <https://zuul-ci.org/>


I presume merge queues are useless if you have flaky tests?


They're super annoying if you have flaky tests. If the tests are such that you can automate retrying and get to a reasonably low false-positive rate, it can be usable.


I like adopting a merge queue as a forcing function to deal with flaky tests.


Why jump to a merge queue? GitHub has had a feature for years that greys out the merge button if the tests aren't passing.


Excited to see this thread. Aviator also offers a high performance MergeQueue with flaky test support: https://docs.aviator.co/mergequeue/managing-flaky-tests-in-m...

Disclaimer: I'm the co-founder of Aviator


A good MQ implementation allows you to make the MQ checks different from the ones on the PR, and to choose for MQ builds only a subset of tests that are known to be non-flaky.

Merge queue builds should be fast and rock stable, following the 80/20 rule.


Not useless - you get spurious rejections and have to resubmit, just like with plain CI, and you can retry on failure to improve the odds. Some test frameworks support rerunning flaky tests.

They’re a good justification to work on that, though.


From what I've observed, semantic merge issues always get picked up at compile time - hence only the compile phase needs to pass to allow a queued merge through. But I guess if you're working a lot with weakly typed or interpreted languages it'd be different.

It's hard not to think there should be some way of automatically flagging likely issues where, say, commits in one PR refer to particular identifiers that were modified in parallel PRs, but it definitely goes well beyond what diff-merge tools typically do today.


> From what I've observed, semantic merge issues always get picked up at compile time - hence only the compile phase needs to pass to allow a queued merge through. But I guess if you're working a lot with weakly typed or interpreted languages it'd be different.

I agree that 90% of the time, with strongly typed languages, a compile is enough.

But even in those languages, you can still get semantic merge issues which aren’t compilation-related.

Suppose your app has a config file. The first PR renames one of the configuration keys in the config file, the second PR adds some new code which reads that configuration key. The combination compiles fine, but the second PR fails in integration tests, because the config key it was expecting wasn’t there any more.
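That scenario is easy to reproduce in a few lines (the key names here are made up for illustration): each PR is green against the main it was written on, and only the combination fails.

```python
# main before either PR lands:
config = {"timeout_secs": 30}

def apply_pr1(cfg):
    """PR 1: rename the key, updating every read site that exists today."""
    cfg = dict(cfg)
    cfg["request_timeout_secs"] = cfg.pop("timeout_secs")
    return cfg

def new_feature(cfg):
    """PR 2: new code, written against the old main, reads the old name."""
    return cfg["timeout_secs"] * 2

# Each PR passes its own tests in isolation...
assert new_feature(config) == 60   # PR 2 on old main: green
merged = apply_pr1(config)         # PR 1 on old main: green
# ...but the combination fails only once both are on main together:
try:
    new_feature(merged)
except KeyError as missing:
    print(f"semantic merge conflict: {missing} is gone")
```

Nothing here is a syntax or type error, which is exactly why a compile-only gate misses it and an integration run in the queue catches it.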

Of course, this is a sign of a design flaw - the name of the config key should occur only once in the program (say, as a constant), so both PRs would use the constant and the second would pick up the rename from the first; or else the first PR renames the constant itself, so the second doesn't compile. Or even a higher-level API which deals with the semantics of what the config represents instead of raw key-value strings.

But a lot of large legacy code bases contain design flaws like this, and fixing them can be a lot of work, so they don’t get fixed, at least not any time soon.


Sure, there are always going to be occasions where a combination of two bits of work done separately cause unexpected behaviour changes - but generally if there are integration/automated E2E tests that are capable of picking that sort of thing up I'd expect they typically get triggered after the mainline merge anyway. OTOH, I can understand at least wanting to ensure that it's not possible for two consecutive merges to the mainline branch to cause it stop building.


If your fancy SaaS git repo thingy doesn't support merge queues, maintain your own using the `--update-refs` option to `git rebase`.

https://andrewlock.net/working-with-stacked-branches-in-git-...


Oh, `--update-refs` is cool. I think this is what I want sometimes. I'll have a commit I want to test against branchA and branchB (e.g. I have `master` and `branchB` with two different implementations of some thing, and I want to test some code in ref Y against each). Previously, I had to create separate branches that were master+Y and branchB+Y, and I might alter branchB and have to rebase branchB+Y on it.

Actually I think it doesn't solve the problem now that I wrote it out, but it is interesting nonetheless.


Hmm, I think there may be some confusion. A merge queue != stacked branches.


Sure, but we use a stacked branch to solve the same problem that a merge queue solves - preventing semantic merge conflicts by predetermining a merge order.


yeah, it's really interesting, I've seen this confusion before


If somebody from GitHub is reading this: when will merge queues land in GitHub Enterprise Server? I work on a project that badly needs the feature.


If you are looking for a merge queue solution on your GHES, either wait until GitHub releases it or start using Mergify.com.


FWIW, graphite.dev's MQ supports GitHub Enterprise Server


Just looked at graphite.dev and it looks like it only works with Github.

Wild that an entire product can exist that depends on another commercial SaaS offering to have any value at all. That feels absurdly risky to me. Like, I am getting anxiety from the thought of working on that product.

Then again, I guess AWS exists and we all make our peace with it.

We really don’t like solid foundations do we.


It’s risky but at the same time it allows creating a significant spike of value without spreading thin.

Like, GitHub PRs are pretty shit, which is an opportunity for Reviewable.


> Like, GitHub PRs are pretty shit, which is an opportunity for Reviewable.

I've used a bunch of VCSes and managed solutions, and PRs are probably the _best_ I've had the pleasure of using. Working with Perforce, everyone is on one (or a small number of) long-lived branch(es), chucking around WIP code as shelves. PlasticSCM/Unity DevOps Version Control (seriously, that's what they renamed it to) provides absolutely no support for a merge-based workflow other than some half-baked tools that barely work.


> I've used a bunch of VCSes and managed solutions, and PRs are probably the _best_ I've had the pleasure of using.

That is genuinely sad.


Could you suggest something better then?


GitHub's market share percentage is a multiple of AWS's — and also, GitHub is just the beginning for us!


> We really don’t like solid foundations do we.

Isn't it all built (run?) on sand?


Also, shameless plug: check out Aviator.co, which also provides merge queue support for GHE self-hosted and cloud.


Or use zuul. That solves your problem.


What's the difference between a merge queue and a pull queue?


It's like a rebase queue, but with more ref-log, and less tree inversion. Pull queues counter-intuitively lean on force-push-with-lease sub-commands, whereas monorepo merge queues have remote build caches for... sorry I blacked out.


I bought that hook, line, and sinker :) I was even feeling a real blackout start at the word "remote". Comedic timing is hard to pull off in text. Is any of that even real?


nah, not real, I was just being silly :)





