Friday, February 8, 2013

Branching Is Easy. So? Git-flow Is Not Agile.

I've had roughly the same conversation four times now. It starts with the question of our deployment/development strategy, and some way in which it could be tweaked. Inevitably, someone will bring up the well-known git branching model blog post. They ask, why not use this git-flow workflow? It's very well laid out, and relatively easy to understand. Git makes branching easy, after all. The original blog post in fact contends that because branching and merging are extremely cheap and simple, they should be embraced.
"As a consequence of its simplicity and repetitive nature, branching and merging are no longer something to be afraid of. Version control tools are supposed to assist in branching/merging more than anything else."
But here's the thing: There are reasons beyond tool support that would lead one to want to encourage or discourage branching and merging, and mere tool support is not reason enough to embrace a branch-driven workflow.

Let's take a moment to remember the history of git. It was developed by Linus Torvalds for use on the Linux project. He wanted something that was very fast to apply patches, and supported the kind of distributed workflow that you really need if you are supporting a huge distributed team. And he made something very, very, very fast, great for branching and distributed work, and difficult to corrupt.

As a result, git has many virtues that align perfectly with the needs of a large distributed team. Such a team has potentially long cycles between an idea being discussed, being developed, being reviewed, and being adopted. Easy and fast branching means that I can go off and work on my feature for a few weeks, pulling from master all the while, without having a huge headache when it finally comes time to merge that branch back into the core code base. In my work on ZooKeeper, I often wish I had bothered to keep a git-svn sync going because reviewing patches is tedious and slow in svn. Git was made to solve my version control problems as an open source software provider.

But at my day job, things are different. I use git because a) git is FAST and b) GitHub. Fast makes so much of a difference that I'm willing to use a tool with a tortured command line syntax and some inherent complexity. GitHub just makes my life easier, I like the interface, and even through production outages I still enjoy using it. But branching is another story. My team is not a distributed team. We all sit in the same office, working on shared repositories. If you need a code review you can tap the shoulder of the person next to you and get one in 5 minutes. We release frequently; I'm trying to move us into a continuous delivery model that may eventually become continuous deployment if we can get the automation in place. And it is for all of these reasons that I do not want to encourage branching or have it as a major part of my workflow.

Feature branching can cause a lot of problems. A developer working on a branch is working alone. They might be frequently pulling in from master, but if everyone is working on their own feature branch, merge conflicts can still hit hard. Maybe they have set things up so that an automated build will still run through every push they make to that branch, but it's just as likely that tests are only being run locally and the minute this goes into master you'll see random failures due to the various gremlins of all software development. Worst of all, it's easy for them to work in the dark, shielded from the eyes of other developers. The burden of doing the right thing is entirely on the developer and good developers are lazy (or busy, or both). It's too easy to let things go for too long without code review, without integration, and without detecting small problems. From a workflow perspective, I want something that makes small problems come to light very early and obviously to the whole team, enabling inherent communication. Branching doesn't fit this bill.

Feature branching also encourages thinking about code and features as all or none. That makes sense when you are delivering a packaged, versioned product that others will have to download and install (say, Linux, or ZooKeeper, or maybe your iOS app). But if you are deploying code to a website, there is no need to think of the code in this binary way. It's reasonable to release code behind feature flags that is not complete but flagged off, for purposes of keeping the integration of that new code in for testing in other environments. Learning how to write code in such a way as to be chunkable, flaggable, and almost always safe to go into production is a necessary skill set for frequent releases of any sort, and it's essential if you ever want to reach continuous deployment.
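
As a minimal sketch of what "flagged off" code can look like (in Python; the flag name, the FEATURE_ environment variable convention, and the checkout functions are all hypothetical and not taken from any particular system), unfinished work can be merged into master while staying unreachable until someone switches the flag on:

```python
import os
from collections.abc import Mapping

SHIPPING_CENTS = 500  # hypothetical flat shipping fee


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the flag is explicitly switched on.

    Assumed convention: flags are read from FEATURE_<NAME> variables.
    Anything missing or unrecognized counts as off, so new code is
    dark by default in every environment.
    """
    value = env.get(f"FEATURE_{name.upper()}", "off")
    return value.strip().lower() in {"1", "on", "true"}


def legacy_total(item_cents: list[int]) -> int:
    # Existing production behavior: always charge shipping.
    return sum(item_cents) + SHIPPING_CENTS


def new_total(item_cents: list[int]) -> int:
    # Work in progress: free shipping on orders of 5000 cents or more.
    # It lives in master and is integrated and tested with everything
    # else, but users only reach it when the flag is on.
    subtotal = sum(item_cents)
    return subtotal if subtotal >= 5000 else subtotal + SHIPPING_CENTS


def checkout_total(item_cents: list[int], env: Mapping[str, str] = os.environ) -> int:
    # Single branch point: one if statement instead of a feature branch.
    if flag_enabled("new_checkout", env):
        return new_total(item_cents)
    return legacy_total(item_cents)
```

Passing the environment in as a parameter is just one way to make both paths easy to exercise in tests; a real team might keep flags in a config service or database instead. Once the feature is complete and stable, the flag check and the legacy path are deleted in an ordinary commit.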

Release branching may still be a necessary part of your workflow, as it is in some of our systems, but even the release branching parts of the git-flow process seem a bit overly complex. I don't see the point in having a develop branch, nor do I see why you would care about keeping master pristine, since you can tag the points in the master timeline where you cut the release branch. (As an aside, the fact that the original post refers to "nightly builds" as the purpose of the develop branch should raise the eyebrows of anyone doing continuous integration.) If you're not doing full continuous deployment you need to have some sort of branch that indicates where you cut the code for testing and release, and hotfixes may need to go into up to two places, that release branch and master, but git-flow doesn't solve the problem of pushing fixes to multiple places. So why not just have master and release branches? You can keep your release branches around for as long as you need them to get live fixes out, and even longer for historical records if you so desire.

Git is great for branching. So what? Just because a tool offers a feature, and does it well, does not mean that feature is actually important for your team. Building a whole workflow around a feature just because you can is rarely a good idea. Use the workflow that your team needs; don't cargo-cult an important element of your development process.

26 comments:

  1. Whatever git flow is used has to align with whatever deploy flow is being used. The flow you reference is clearly for rare deploys (monthly or more). It's not my preference, but for people working in that context, feature branches can be very helpful.

    In the other direction, not even a release branch is necessary as it can be simulated by the checkout of master on the deploy server.

    Source control workflows and deploy workflows are part of the same thing. They can't be mixed and matched when they don't fit, and it's necessary to update one when the other changes.

  2. Yes, this is exactly right. Unfortunately many developers seem to read the git-flow blog post referenced here and take it as The Way without realizing that it is in fact only applicable for a specific scenario.

  3. I don't think I've seen anyone put forward the argument "git supports feature branching, therefore you should use feature branches". It's more "git supports any workflow, whether or not it involves feature branches, therefore you should use git".

    I can't imagine *not* taking advantage of the ease of branching in git though. In addition to the work I do at my day job, I'm involved in an open source project and I use git for both. I'd be half as productive if I didn't make use of branches at least for local development.

    But certainly every team needs to come up with the workflow that best suits its own needs, and then choose a tool that supports it, not the other way around. It just so happens that many workflows do involve branching, and git makes this easy, whereas other VCSs make it a complete nightmare. At my day job we decided that git-flow didn't in fact suit our particular needs, and we now use the branching strategy described here: http://dymitruk.com/blog/2012/02/05/branch-per-feature/ It's working pretty well for us.

    Replies
    1. I wrote this post because I have folks constantly putting forward the argument that git supports branching -> feature branching should be part of the workflow. So whether or not you've seen it, I hear that argument enough to write a blog post on why I don't like it so I can save myself some typing next time it comes up.

      You make a great point that easy branching makes local development more productive, which is totally true and a great feature of git. But it's sort of orthogonal to a whole-team workflow requiring lots of branching.

  4. Right on! The "git branching model" is simply the textbook description of the waterfall development process.

  5. A developer is refactoring code. You come along and ask him to fix a production bug. There is no flag. How do you address this situation?

    I suppose you could flag every change you did to the code and then remove it and the old code once it was in and stable, but what is the advantage over branching?

    Replies
    1. A major refactoring is a rare exception case, because it is not a feature but it (occasionally) is an act that crosses releases. There's not a great solution to that case, and I am not suggesting that you never ever branch, merely that branching isn't something that regularly needs to happen as part of an agile development flow.

      Note that if you're branching for a major refactor, you're under a lot of pressure to make sure that all changes are pulled into your branch and it doesn't start to conflict heavily with master. Honestly, even with branching, a major refactoring that crosses releases is likely to be a nasty situation if other changes are happening to the code base simultaneously.

  6. The title of this post is puzzling. What does this discussion have to do with Agile?

  7. It seems entirely unrelated to "Agile", merely buzzwords to draw readers in.

    Replies
    1. A big part of agile is continuous integration. Branching, as often implemented, makes true continuous integration hard. I believe that fundamentally a development approach that relies heavily on long-lived branches is generally not an agile flow.

  8. Thanks for the post, I couldn't have said it better!

  9. So to summarize your opinion, branching is too much overhead for Agile?

  10. Great post!

    So gitflow(tm) introduces the long-lived development branch. Any developer worth their salt knows that long-lived branches are inherently evil. Things get into the branch early, then, for some reason, are not released. So now the development branch is different from prod (master). This causes all sorts of problems as new code introduced to the branch depends on code not in master. Hilarity ensues, but actually it is not funny.

    The correct way to work is that master is what is in production. Period. All new work is done on a branch from production (master). The branch goes through the complete QA cycle. Any discrepancies can be compared against master with the same data, since no other branches are involved. Once the branch is approved, it can be merged with other tickets into a new release branch. That branch is then QA'ed. Any problem tickets can be removed and a new release branch can be made from the remaining tickets and tested. Once the release branch is approved, it is merged to master and deployed.

    Any branches existing after a deploy are rebased against or merged with the new master branch. Hotfixes are done against master and merged immediately. Outstanding branches are rebased or merged against master. All branches are based off master at all times. The release branch only exists while it is being QA'ed as a whole, as every ticket was QA'ed alone.

    This is agile. Adapt or die.

    Replies
    1. Ok, we switched to agile and CI. But how do you support several prod releases of a framework which are in turn being used by different development projects and even older releases used by apps already in production? How do you promote hot fixes to some, but not all? And how do you downport new features to those releases? Does CI cater for this?

    2. Each time you release, you should be creating a release branch, e.g. release/v6.2.x. You would patch that branch as you are releasing, and once the final release is ready to go LIVE (e.g. 6.2.45), you tag the repo with that release number (git tag v6.2.45), then merge the release branch into master, and DELETE the release branch.

      You can then use that tag to check out the production code at any point for hotfixes (e.g. git checkout -b hotfix/v6.2.46 v6.2.45) and promote that hotfix as necessary.

      Each disparate version you are hosting in a live environment would need its own hotfix - you'd just check out the relevant tag and do the fix. If you're lucky, a single hotfix may be possible for different versions - in that case, you could fix in one hotfix branch, then cherry-pick the commit into the other hotfix branches for testing.

      I assume by downporting you mean downgrade? The best way to achieve this - unless you are /very/ careful in your release and feature orchestrations - is to check out the tag, then revert the changes. You could revert / reset commits, or manually make the changes and make an additional commit. Either way, I would recommend increasing the version.

      If you want to downgrade to a previous version, a third way would be to simply check out the old tag in a release branch, then push the release branch to your CI / CD server.

  11. I agree with everything you say about git-flow, apart from the dismissal of feature branches. They are required in order to keep your features separate until they are ready to be merged into the main pipeline / deployed (regardless whether you have CD or just CI).

    What if a developer is working on a 3 week long feature, only to be asked in week 2 to park it and work on something else? If they worked directly in master, they would have half a feature developed and deployed. Stashing only goes so far, but feature branching solves the problem entirely - whether you're working on a binary distribution or a website.

    This also rings true if a developer wishes to follow the recommended "Commit small, commit often" mantra. You may be in the middle of a feature (with several commits), but need to push the changes in order to work on them on another machine; I do this a lot - I switch between a VM for work, my laptop, and my desktop PC at home. I need access to my feature branches before they're ready for other developers, and pushing my commits to master is absolutely not acceptable.

    Yeah, feature branches can become stale, and they can be overly complex for simple developments, but when you have anything even remotely complex you have to learn to deal with this, and feature branching is the only sensible way to do it.

    There's also the commit history argument - I may very well make 200 commits in a feature branch; when I want to merge that feature into master the best way is often to rebase my feature branch on master, fix any resulting "gremlins" - as you put it, then merge my feature into master (I can even squash those commits into one single commit for the entire feature if I want). That means my commit history is clean, readable and maintainable.

    Replies
    1. For completeness, I don't use "develop" branches either - master with release tags, plus release and feature branches. Develop is completely superfluous.

    2. So, when you get pulled off in week 2 to work on something else, what happens to that feature branch? Are you continuing to update it with changes from master? Are you keeping it in sync so that when you finally get back to it, you can pick up where you left off?
      I just don't think that an environment where you have a lot of small changes works well simultaneously with people going off on feature branches for multiple weeks. If you're getting pulled off the feature after 2 weeks, it is better to have the code written in master so that it is being kept up to date for when you return. If the code base is slow-moving or has fewer developers overlapping their work, feature branches can work fine (aka, the OSS process that I explicitly mentioned in the post). But for most product development teams, this is not the case.
      Hopefully, getting pulled off of a 3 week project 2 weeks in is a very rare occurrence. If not, you've got bigger problems than the way git is being used!

    3. Generally, you wouldn't update your feature branch from master. You'd merge it into master once complete - then sort any problems out. Alternatively you /can/ rebase your feature branch on master before you do the merge - which is more sensible (as long as your feature branch isn't shared - in that case, you would just merge master into your feature branch and put up with the potential additional commit). You wouldn't "keep in sync" until you needed to work on it again - and even then, it's not completely necessary - depending on what you're working on.

      Of course, ideally you don't want to be pulled off a feature, but business needs are usually the driver - depending on the type of company you work for.

      Regardless of the amount of time you're working on a feature branch, feature branching is a must. You could spend an hour working on a bug fix, only for another urgent bug fix to come in that needs to be added to the next release immediately. Mind, that kind of work is usually done on release branches, or as a hotfix taken from your release tag.

    4. I see that you think feature branching is a must, but I have not seen that to be the case, and your argument does not seem to even address that conclusion, let alone support it.

  12. This comment has been removed by the author.

  13. Hi. I agree with you that GitFlow is not natural. Could you write a blog post about another branching strategy? I was thinking that it would be more natural to use the master branch as the dev branch and have production branches instead. That way you avoid the noobs that check out and push directly to the master branch.

  14. This is nonsense. Branching strategies are neither agile nor non-agile. Agility has to do with the ability to change, not how one branches one's code.

    Replies
    1. The way we work has a direct impact on our ability to change, and some ways of working make change easier than others.

