Branching and Merging - An Agile Perspective
Written by Robert Cowham
Tuesday, 15 November 2005 16:00
Two years ago this month we published "Codeline Merging and Locking: Continuous Updates and Two-Phased Commits" (which has been reprinted this month using the CM Journal forum that wasn't yet in use at that time). We would like to build upon it this month and focus particularly on branching and merging. We present some background for branching and merging, and consider some of the implications for agile development in particular. We also hope to reduce some of the suspicion that many agile developers have of branching. The article assumes some overall branching knowledge, yet revisits some particular details that often seem to confuse people. This is a fertile area which we will continue to expand on in future articles.

Branching and Merging

Branching is a mechanism for isolation and parallel development which is useful in many situations, as discussed below. The advantage of branching with good tool support is that it allows you to merge changes between branches more easily, and in a less error-prone manner, than manually making the same code changes to the two separate branches (see Introduction to Branching from Streamed Lines).

There is little lasting value in branching if the branches will diverge indefinitely without propagating any changes. It may be seen as a short-term win that allows a new project to get up and running very quickly from an existing codebase for another product. But the gains of such persistent variants are very short-term, and the resulting long-term costs of proliferating code and effort across the same codebase can quickly overwhelm whatever short-term gain is to be had.

So whenever we branch, we will almost always need to merge back to another branch. Merging changes between branches does not come for free, and the associated costs and implications need to be carefully considered when deciding when and how to branch.

Agile Branching and Merging Considerations

Early and frequent feedback is a key element in all agile methods.
Practices such as Continuous Integration help ensure that integration issues are never left to fester for long periods of time and then come back to bite you at inopportune moments (just when you are trying to get a release out the door), but are instead confronted and resolved on a regular (at least daily) basis.

Some agile proponents strongly oppose branching, preferring to use alternative solutions for some scenarios where branching is usually a better fit, such as maintaining multiple releases. Of course, branching is not a panacea, but the need to branch often stems from business reasons rather than technical ones. Maintaining multiple releases or custom variants is typically a business decision arising from a contractual agreement, or Service-Level Agreement (SLA), that is often a source of considerable additional revenue. So even though maintaining multiple releases may come at additional cost, it may often be fed by an additional revenue stream for that specific purpose. Branching is quite often the best way to meet those needs, and there are ways to branch that still allow us to be "agile" and focus upon delivering business value to our customers.

We have seen a number of projects work for considerable periods of time (months if not years) using a single mainline with no branches. The fact that they could is a testament to the soundness of their practices. Yet we as SCM professionals regularly encounter misunderstandings of what version branching and merging are all about, specifically what the state-of-the-art is as regards tools and processes.

Of course we don't want to fall into the Branch-a-holic trap, nor do we want to feed the fear that manifests itself as Merge-a-phobia in the reluctance of some people to use branching approaches when they are applicable. With any luck, this article will allay many of the fears and trepidation on both ends of the branching-and-merging spectrum!

Automatic Merging

The state-of-the-art of software merging tools has come a long way.
However, it bears repeating that there is no such thing as a 100% reliable merge program. Human intervention is invariably required at some point (almost reassuring in that there will still be some jobs around for the foreseeable future).

All automated merge programs have their strengths and weaknesses, but most perform merges at the level of blocks of text. There is usually no difference in the algorithm used for merging two text files containing poetry and two text files containing C++, Java or any other language you care to mention. Indeed, once it is possible to perform 100% correct merges at the semantic and syntactic level, the merge system may well be able to write whatever program we need 5 minutes before we know we needed it!

Having said that, we believe that the progress made in IDEs which support refactoring for some languages will lead to support for merging at the level of the parsed language tree rather than purely at the textual level (at least for languages such as Java). For example, renaming a variable does not change the meaning of the resulting code, but can mean that many individual lines differ between two versions. If one person renames the variable and another edits lines which use the old name, then the real changes will be obscured by "conflicts."

Some Detailed Merging Examples

For clarification, let us look at some detailed examples of types of merge - a couple of examples clarify pages of theory! The catch-up/publish terminology used here was first referenced in our article Tasks and Branching Patterns in the Symbian case Study.

Laura Wingerd presented at a recent conference on the Flow of Change (a chapter in her new book from O'Reilly). She introduces the concept of the "Tofu Scale" to describe the properties of codelines and how firm or soft they are!
If you draw firmer codelines above softer ones (in the diagrams below the mainline is firmer than the personal branch), then catch-up becomes "merge down" and publish becomes "copy up."
In figure 1, the file starts out with contents [A,B,C,D] (where each letter represents a block which may be 1 line, 10 lines or any number of sequential lines in the file) and is branched to the personal branch which is just a copy. On the mainline, B is changed to B2, and on the personal line A is changed to A1. Performing a "merge down" gives [A1,B2,C,D] (assuming for now that the two changes are compatible). The merge down performs the (relatively) risky merge in the codeline which is "softest" on the Tofu scale! In this case, the merge is not that risky. The publish operation is simply a "copy up" of the resulting file back to the Mainline, which now includes the change.
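The merge down and copy up in figure 1 can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that is ours, not any tool's: blocks line up one-to-one across the base, the mainline and the personal branch, with no inserted or deleted blocks. Real merge tools first run a diff to align the blocks. The function name `merge_down` is hypothetical.

```python
def merge_down(base, mainline, personal):
    """Three-way merge of equal-length block lists (simplified sketch).

    A block counts as changed on a codeline when it differs from the
    common base it was branched from.
    """
    merged = []
    for original, main_block, personal_block in zip(base, mainline, personal):
        if main_block == personal_block:
            merged.append(main_block)        # identical on both codelines
        elif main_block == original:
            merged.append(personal_block)    # only the personal branch changed it
        elif personal_block == original:
            merged.append(main_block)        # only the mainline changed it
        else:
            # Both codelines changed the same block differently.
            raise ValueError(f"conflict: {main_block!r} vs {personal_block!r}")
    return merged


base = ["A", "B", "C", "D"]
mainline = ["A", "B2", "C", "D"]     # B changed to B2 on the mainline
personal = ["A1", "B", "C", "D"]     # A changed to A1 on the personal branch

personal = merge_down(base, mainline, personal)   # catch-up ("merge down")
mainline = list(personal)                         # publish ("copy up")
print(personal)   # ['A1', 'B2', 'C', 'D']
```

Note that the only step involving any decision-making runs against the personal branch; the mainline simply receives a copy of the result.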
Figure 2 shows a file that is changed only in the Mainline and brought into the personal Task branch so that it can be tested together with all other changes from the Mainline, including those that required merging. Note that the publish step is not required, since in this case there is no further change to go back.
In figure 3, we throw in a conflicting change such that B1 and B2 clash and need to be manually edited and turned into B3. This sort of merge is more risky, and would need to be built and tested locally (Private System Build and Smoke Test) before checking in. However, the publish is the same as before - just a copy from source to target.
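A conflicting merge of this kind might be sketched as follows. Again this is a simplified illustration, not how any particular SCM tool works: blocks are assumed to align one-to-one, we assume B2 is the mainline change and B1 the personal one, and the `resolve` callback is a hypothetical stand-in for the person editing the clashing block by hand.

```python
def merge_with_resolver(base, mainline, personal, resolve):
    """Three-way merge that hands clashing blocks to a resolver callback."""
    merged = []
    for original, main_block, personal_block in zip(base, mainline, personal):
        changed_main = main_block != original
        changed_personal = personal_block != original
        if changed_main and changed_personal and main_block != personal_block:
            # Both codelines changed the block differently: human decision needed.
            merged.append(resolve(original, main_block, personal_block))
        elif changed_main:
            merged.append(main_block)
        else:
            merged.append(personal_block)
    return merged


# Stand-in for the manual edit: the developer turns the B2/B1 clash into B3.
manual_edits = {("B2", "B1"): "B3"}

def resolve(original, main_block, personal_block):
    return manual_edits[(main_block, personal_block)]


base = ["A", "B", "C", "D"]
mainline = ["A", "B2", "C", "D"]
personal = ["A", "B1", "C", "D"]

personal = merge_with_resolver(base, mainline, personal, resolve)
print(personal)   # ['A', 'B3', 'C', 'D']
```

The resolved result still lives only on the personal branch at this point, which is where the local build and smoke test would be run before anything is published.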
In the first 3 examples, the merge into the personal branch is done using normal merge facilities and may require checking, and certainly a local build/test, before checking it in. However, in all examples, the publish is just a copy, which should be fully scriptable in any reasonable SCM tool (including some straightforward validation to ensure that it really is a copy and that nothing has changed on the mainline since the last catch-up). Thus the script must catch the example in Figure 4, where there is still a Catch-up needing to be done, and thus the Publish is rejected.
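The validation such a publish script needs can be sketched as below. The data model is an assumption made purely for illustration: the mainline is a Python list of file contents indexed by revision number, and the branch remembers the mainline revision of its last catch-up. The names `publish` and `PublishRejected` are hypothetical; a real script would call the SCM tool's own commands instead.

```python
class PublishRejected(Exception):
    """Raised when the mainline has moved on since the last catch-up."""


def publish(mainline_history, last_catch_up_rev, branch_content):
    """Copy the branch content up to the mainline, or reject the publish.

    mainline_history: list of file contents, index = revision number.
    last_catch_up_rev: mainline revision the branch last merged down from.
    """
    head = len(mainline_history) - 1
    if head != last_catch_up_rev:
        # Someone else published after our catch-up (the Figure 4 case).
        raise PublishRejected(
            f"mainline is at revision {head}, branch caught up at "
            f"{last_catch_up_rev}: catch up again before publishing"
        )
    mainline_history.append(list(branch_content))   # a pure copy, no merging
    return len(mainline_history) - 1


mainline = [["A", "B", "C", "D"], ["A", "B2", "C", "D"]]

# First branch caught up at revision 1 and publishes cleanly.
new_rev = publish(mainline, 1, ["A1", "B2", "C", "D"])

# Second branch also caught up at revision 1, but the mainline has moved.
try:
    publish(mainline, 1, ["A", "B2", "C", "D9"])
    rejected = False
except PublishRejected:
    rejected = True

print(new_rev, rejected)   # 2 True
```

Because the check guarantees the branch already contains everything on the mainline, the copy can never silently overwrite someone else's change.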
Figure 5 shows the sequence of events for a single file (Mn is the name/revision on the Mainline, Fn on the Task branch). Of course, behind the scenes the script would need to branch all the files in the change-set Fred was working on so as to have a consistent change-set in the Task branch. The name of the Task branch could be generated automatically using some simple heuristics, possibly including the date/time of the change or some similar identifier. While this might at first sight appear complicated, it is not difficult to do, and the result is that the stable implemented function is stored in a way that would be very easy to remove cleanly if future business needs dictated.

Steve Berczuk develops software applications at Iron Mountain in Boston, MA. He has been developing object-oriented software applications since 1989, often as part of geographically distributed teams. In addition to developing software, he helps teams use Software Configuration Management effectively in their development process. Steve is co-author of the book Software Configuration Management Patterns: Effective Teamwork, Practical Integration. He has an M.S. in Operations Research from Stanford University and an S.B. in Electrical Engineering from MIT. You can contact him at steve@berczuk.com.

Brad Appleton is co-author of Software Configuration Management Patterns: Effective Teamwork, Practical Integration. He has been a software developer since 1987 and has extensive experience using, developing, and supporting SCM environments for teams of all shapes and sizes. In addition to SCM, Brad is well versed in agile development, and cofounded the Chicago Agile Development and Chicago Patterns Groups. He holds an M.S. in Software Engineering and a B.S. in Computer Science and Mathematics. You can reach Brad by email at brad@bradapp.net.

Robert Cowham is the founder of Vaccaperna Systems, providing SCM consultancy and training to organisations.
With 20 years of experience in software development, he has long had an interest in SCM, and has worked with clients around the world in this arena during the last 7 years. He is the new Chair of the CM Specialist Group of the British Computer Society. He has a BSc in Computer Science from Edinburgh University and is a Chartered Engineer (CEng MBCS CITP). You can contact him at rc@vaccaperna.co.uk
Last Updated on Friday, 18 September 2009 08:10

Figure 1 - Simple Use of Personal Branch with Merge Down/Copy Up paradigm.
Figure 2 - File not changed in Personal branch - change brought in from Mainline
Figure 3 - Resolving a conflict
Figure 4 - A Failed Publish
Figure 5 - Branch on Conflict
