CMCrossroads
Published on CMCrossroads (https://www.cmcrossroads.com)


Lean Development Principles for Branching and Merging

By Robert Cowham, Brad Appleton, Steve Berczuk - July 18, 2007
Summary:
By reworking lean principles for the branching and merging arena, we're able to create automated builds and unit tests to increase effectiveness and improve quality in software configuration management. Individual developers and teams alike can benefit from this process-improving strategy.

With lean principles, we can:

  1. Eliminate Waste - Eliminate avoidable merge-propagation (multiple maintenance), duplication (long-lived variant branches), and stale code in infrequently synchronized workspaces (partially completed work). Detect these sorts of situations using some judicious metrics (discussed further below).
  2. Build Quality In - Maintain codeline integrity with (preferably automated) unit & integration tests and a Codeline Policy to establish a set of invariant conditions that all checkins/commits to the codeline must preserve (e.g., running and passing all the tests).
  3. Amplify Learning - Facilitate frequent feedback via frequent/continuous integration and workspace update.
  4. Defer Commitment (Decide as late as possible) -- Branch as late as possible! Create a label to take a "snapshot" of where you MIGHT have to branch from, but don't actually create the branch until parallelism is needed. See the example below on "branch on conflict".
  5. Deliver Fast (Deliver as fast as possible) -- complete and commit change-tasks and short-lived branches (such as task-branches, private-branches, and release-prep branches) as early as possible and merge back to the mainline.
  6. Empower the Team (Decide as low as possible) -- let developers reconcile merges and commit their own changes (as opposed to some "dedicated integrator/builder"). Educate and train developers in patterns and their tools such that they are able to select the most appropriate pattern and apply it.
  7. Optimize the "Whole" -- when/if branches are created, use the Mainline pattern to maintain a "leaner" and more manageable branching structure. Use an appropriate balance of methods, some of which conflict.
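
The Codeline Policy in principle 2 can be pictured as a small set of named invariants that every proposed commit is checked against. The sketch below is only an illustration: the `Commit` record, its fields, and the three invariants are hypothetical, and a real policy would be wired into the check-in triggers or hooks of whichever SCM tool is in use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Commit:
    """Hypothetical summary of a proposed commit to the codeline."""
    builds: bool
    tests_run: int
    tests_failed: int


# Assumed policy: each invariant is a name plus a predicate on the commit.
CODELINE_POLICY: Dict[str, Callable[[Commit], bool]] = {
    "builds cleanly": lambda c: c.builds,
    "tests were run": lambda c: c.tests_run > 0,
    "all tests pass": lambda c: c.tests_failed == 0,
}


def policy_violations(commit: Commit) -> List[str]:
    """Return the names of the invariants this commit would break."""
    return [name for name, holds in CODELINE_POLICY.items() if not holds(commit)]
```

A commit is acceptable only when `policy_violations` returns an empty list; anything else can be reported back to the developer by name.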

Much of the above is fairly obvious, and yet there are some implications and advice that we can perhaps tease out.

Metrics for Waste and Whole Process Optimization
Principles such as Eliminate Waste, Deliver Fast and Optimize the Whole all benefit from the use of appropriate metrics to measure and track what is happening and allow feedback on the process (Amplify Learning) to be implemented.

Useful metrics (this is usually easier in those tools which track such information centrally) include:

    • Changes done in Task Branches vs Changes in the Mainline - are small tasks done directly in the mainline? How many changes does it take to implement tasks in different branches - is there a pattern?
    • Changes not yet merged from Task Branches to Mainline (WIP). Ensure this doesn't get too high.
    • Age for Changes not yet merged (how stale)
    • Files checked out in Workspaces (by age) - well worth keeping an eye on. It's amazing how many old workspaces can lurk around, owned by people who have long since left the company. Is your SCM tool linked in to the HR system to manage company leavers (i.e. included in user permissions to revoke or remove access)?
    • Number of conflicts - either for Task Branches or for Workspace updates - indicates which to use, or perhaps even some repartitioning of the code to reduce conflicts.

Collection of all metrics needs to be automated, and should preferably require no extra work on the part of the developers creating branches or checking in changes.
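
As a rough sketch of how the work-in-progress and staleness metrics above might be computed once change records have been extracted from the SCM tool: the `Change` record and its fields are assumptions for illustration, not the schema of any particular tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Change:
    """Hypothetical record of one change submitted to a task branch."""
    branch: str
    submitted: date
    merged_to_mainline: Optional[date] = None  # None means not yet merged


def wip_count(changes: List[Change]) -> int:
    """Number of changes not yet merged to the mainline (WIP)."""
    return sum(1 for c in changes if c.merged_to_mainline is None)


def oldest_unmerged_age_days(changes: List[Change], today: date) -> int:
    """Age in days of the stalest unmerged change, or 0 if there is none."""
    ages = [(today - c.submitted).days
            for c in changes if c.merged_to_mainline is None]
    return max(ages, default=0)


def wip_by_branch(changes: List[Change]) -> Dict[str, int]:
    """Unmerged change count per branch, to show where WIP accumulates."""
    counts: Dict[str, int] = {}
    for c in changes:
        if c.merged_to_mainline is None:
            counts[c.branch] = counts.get(c.branch, 0) + 1
    return counts
```

Run on a schedule against data mined from the repository, functions like these need no extra effort from developers; thresholds for "too high" or "too stale" are left to the team.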

Some people use automated scripts driven by a configuration file that lists, for example, the current set of active branches to be processed (mined for data). Consider whether a repository structure or branch naming convention can be made sufficiently regular that scripts can deduce the presence of new branches automatically from their location in the repository. For example, in tools such as Subversion, Perforce or Team Foundation Server, branches exist in the path space of the repository - make sure their location and the naming standard used are regular enough to be automatable.
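
To illustrate, suppose (purely as an assumption) that the mainline lives at `/trunk` and that task and release branches live at `/branches/task/<name>` and `/branches/release/<name>`. With a layout that regular, a script can discover branches from a plain listing of repository paths, with no hand-maintained configuration file:

```python
import re
from typing import Iterable, List

# Assumed layout: /trunk, /branches/task/<name>, /branches/release/<name>.
BRANCH_ROOT = re.compile(r"^(/trunk|/branches/(?:task|release)/[^/]+)(?:/|$)")


def discover_branches(paths: Iterable[str]) -> List[str]:
    """Return the sorted set of branch roots implied by the given paths."""
    roots = set()
    for path in paths:
        match = BRANCH_ROOT.match(path)
        if match:
            roots.add(match.group(1))
    return sorted(roots)
```

Paths that fall outside the convention (tags, ad hoc directories) are simply ignored, which is also a cheap way to spot branches created in the wrong place.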

Task Branches - Help or Hindrance?
Task branches are aimed at allowing changes to be made independently from a mainline and merged back as one complete unit. They permit frequent check-ins which may contain value, yet which are perhaps not fully tested and thus risk breaking the mainline.

There are many organizations, particularly agile development shops, which do not like to use Task Branches at all due to the perceived overhead of merging changes, and the dangers of delaying changes. They prefer to make all changes on the mainline and to deal with conflicts within the workspace.

As we mentioned in our original Branching article [1], there is in fact no difference between resolving a conflict which results from a Private Workspace Update, and the conflict which results from merging between two branches. In the same article we suggested the possibility of branching on conflict (the equivalent of the Private Checkpoint pattern) to ensure that the original workspace version is saved in the repository. This was specifically to address the requirements of an agile project.

Automated Merging Between Branches
This may seem rather dangerous to many people, and yet it is well worth considering.

To be able to do it at all will require good development practices and tools:

    • Reliable automatic merge tool
    • Developers checking in consistent change sets
    • Automated builds and unit tests to provide suitable quality guarantees

Of course it is not possible to automate 100% of merges, and there will always be the need for developers to get involved to resolve conflicts or failed tests.

The more you keep your change sets small and consistent, and merge each one individually, not in one big lump, the more likely it is that automated merge will be of benefit. In addition, the cleaner the code under development, the easier life will be too!
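
The idea of merging each change set individually, and stopping as soon as a developer is needed, can be sketched as follows. This is a simplified model, not how any particular tool works: a tree is a mapping from path to file content, the three-way merge operates on whole files (real merge tools work at line level), and `tests_pass` is a hypothetical stand-in for the automated build and unit tests.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tree = Dict[str, str]  # path -> file content


def three_way_merge(base: Tree, ours: Tree, theirs: Tree) -> Tuple[Tree, List[str]]:
    """File-level three-way merge; returns (merged tree, conflicting paths)."""
    merged: Tree = {}
    conflicts: List[str] = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only their side changed
        elif t == b:
            result = o          # only our side changed
        else:
            conflicts.append(path)
            result = o          # keep ours until a developer resolves it
        if result is not None:  # None means the file is absent or deleted
            merged[path] = result
    return merged, conflicts


def auto_merge(history: List[Tree], target: Tree,
               tests_pass: Callable[[Tree], bool]) -> Tuple[Tree, Optional[int]]:
    """Apply each source change set (history[i-1] -> history[i]) to target.

    Returns the resulting target tree and the index of the first change set
    that needs a developer (conflict or failed tests), or None if all merged.
    """
    for i in range(1, len(history)):
        merged, conflicts = three_way_merge(history[i - 1], target, history[i])
        if conflicts or not tests_pass(merged):
            return target, i
        target = merged
    return target, None
```

Because each change set is applied and tested on its own, a failure points at one small change rather than at one big lump.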

In the worst case, a subtle bug might be introduced. Some organizations are so fearful of this that they ban automated merging outright, but bugs happen often enough with ordinary developer changes anyway, so the overall increase in productivity from automated merging (in spite of occasional introduced issues) is likely to be significant (consider appropriate metrics to track this).

Chris Berarducci describes the use of automated merging, particularly in maintaining localized versions of a product [2]. While the original article was written in 2003, the approach is still in use within Palm and of major benefit.

Critical to the "Merge as you go" success are:

    • The SCM Tool's merge facilities
    • Established roles and responsibilities
    • A single automated daemon
    • Conflicts that can be handled on the engineer's own CPU

Development engineers like it:

    • They own the configuration and merging; they do not need to consult with another group or person to turn on or off the auto merge daemon.
    • They are relieved from thinking about merges unless there is a conflict or configuration change needed. When it is on and configured, changes checked into one tree will be migrated to the destination tree(s) in a timely and reliable manner.
    • It's easy to set up and the configuration files are tracked in the SCM tool.
    • It's easy to turn off and on

More information is provided in the submission comments:

  • Easier for QA and Program Management to generate release notes, and for other non-technical users to understand changes and their history

While the system as defined works very well, Palm are looking at managing conflicts via a centralized web interface - this would allow conflicts to be resolved from pretty much anywhere and by anyone and would reduce the overhead.

There is some similarity between this situation and the centralized code review system which has been implemented by Guido van Rossum at Google [3].

Conclusion
Branching and merging are key practices in Software Configuration Management, and many organizations do not get the best value out of them. Applying Lean Principles can make a significant difference to your effectiveness.

The principles, and indeed the mindset, are key factors - if you are aware and looking for possibilities to improve your process you will find them (and most of the time they will not be difficult to implement). If you don't look, and just rely on your tool's support or on individual developers or teams to address this area, you are missing out big time.

We are keen to learn of more examples of good practice - let us know and continue to share.

References

    [1] An Agile Perspective on Branching and Merging, CM Journal, 2005, by Robert Cowham, Brad Appleton and Steve Berczuk
    [2] Merge as You Go, Chris Berarducci, Handspring/Palm
    [3] Google Code Review Process, Guido van Rossum

        About the author

        Robert Cowham

        Robert Cowham has long been interested in software configuration management while retaining the attitude of a generalist with experience and skills in many aspects of software development. A regular presenter at conferences, he authored the Agile SCM column within the CM Journal together with Brad Appleton and Steve Berczuk. His day job is as Services Director for Square Mile Systems whose main focus is on skills and techniques for infrastructure configuration management and DCIM (Data Center Infrastructure Management) - applying configuration management principles to hardware documentation and implementation as well as mapping ITIL services to the underlying layers.

        About the author

        Brad Appleton

        Brad Appleton is an Enterprise Agile+DevOps leader, coach & manager, and seasoned DevOps/ALM/CM solution architect at a large Fortune 100 company. Currently he helps organizations and teams scale, adopt and apply lean/agile development methods and DevOps/ALM/CM practices and tools. He is co-author of Software Configuration Management Patterns and the Streamed-Lines branching patterns, a columnist for the AgileConnection (and CMCrossroads) at Techwell.com, and a former section editor for The C++ Report. You can read Brad's blog at blog.bradapp.net.

        Connect with Me

        • http://linkedin.com/in/BradAppleton
        • https://twitter.com/BradApp
        • https://www.facebook.com/bradapp
        • http://bradapp.blogspot.com/

        About the author

        Steve Berczuk

        Steve Berczuk is a Principal Software Engineer with experience as a manager, Scrum Master and technical lead in Boston, MA. The author of Software Configuration Management Patterns: Effective Teamwork, Practical Integration, he is a recognized expert in software configuration management and agile software development. Steve is passionate about helping teams work effectively to produce quality software. He has an M.S. in operations research from Stanford University and an S.B. in Electrical Engineering from MIT, and is a Certified ScrumMaster. Visit berczuk.com and follow his blog at blog.berczuk.com.

        Connect with Me

        • http://linkedin.com/in/steveberczuk
        • http://twitter.com/sberczuk
        • http://blog.berczuk.com
        ©2011-2013 TechWell Corp.