Continuous Staging: Scaling Continuous Integration to Multiple Component Teams
Written by Brad Appleton, Steve Konieczka, Steve Berczuk
Monday, 01 March 2004 16:00
This month we will discuss some of the difficulties encountered when attempting Continuous Integration for multiple component teams working together to develop a large system. We describe the concept of a Staging Area to help coordinate the teams and stabilize the interdependencies between built versions of components.
Continuous Integration for Component-Based Development
In small team development, the practice of Continuous Integration [2] is an effective technique for keeping everyone on the team coordinated with the latest results of everyone else's changes. Practicing Continuous Update [3] and Private Build [4] in one's Private Workspace [5], as part of a Two-Phased Codeline Commit strategy, helps ensure workspaces and work tasks stay "in sync", while Integration Builds ensure the team's codeline remains stable, consistent, coherent, and correct.
Any built version of a component that needs to be accessed by internal stakeholders (such as a QA/V&V group) needs to be identified (e.g., using a label/tag). This ensures that anyone who needs to look at it (even after it is no longer "latest and greatest") can easily do so, and can know which version of the component and its corresponding source code they are looking at.
In an ideal world, we could build the entire system directly from the sources in a one-step process, for everyone working on any component. And (again) ideally, we would have the storage capacity, network bandwidth, processing power, and load distribution to build the whole system (or at least incrementally build the whole system) every time before a developer commits their changes to the codeline. For reasons of build-cycle time, network resource load, schedule coordination (e.g., multiple time zones, or interdependent delivery schedules of components), or other constraints, this is not always feasible.
And what happens when I have multiple teams and multiple components, each with its own integration schedule or "rhythm", and each needing to coordinate with a larger-grained system integration strategy? There are many dependent factors to consider, including:
System Integration for Multiple Component Teams
Many large projects and systems require multiple teams of people to work together. In component-based development and elsewhere, it is common practice to see a system partitioned into multiple subsystems and/or components, with a team allocated to each component of the larger system (a component team [1]):
Each team should be responsible for ensuring it delivers working code and executables to other internal stakeholders. When a Two-Phased Codeline Commit protocol is used, each Task-Level Commit to the codeline is essentially signing and sealing that you have successfully compiled and tested the entire component with your changes incorporated.
If the team is large or dispersed enough to actually warrant "subteams", it may become necessary for each component team to deliver "tested, closed, sealed and signed, versioned binary libraries" to the rest of the component teams. If the repository contains several components, and people build only their own components to test their changes (rather than the entire system), then they are ensuring they have "tested, closed, sealed and signed" deliverables only for that one specific component!
Using a Staging Area
A common best practice for coordinating cross-component build dependencies is called a "Staging Area" or "Staging Environment". A Staging Environment is like a "sandbox" or "workspace" reserved for sharing build/test dependent artifacts (headers, libraries, executables, etc.). It works something like this (note, this is not specific to XP/Agile development):
Figure 1. Delivering Built Components into a Staging Area
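As a rough illustration of what a delivery into a staging area might look like, here is a minimal Python sketch. It is not taken from any particular SCM tool: the function name `deliver_to_staging`, the `<staging_root>/<component>/` layout, the `manifest.json` file, and the label `BILLING_BUILD_42` are all assumptions made up for this example. The sketch copies a component's built artifacts (headers, libraries, executables) into the shared area and records the source label they were built from, so that other teams can tell which version they are building against.

```python
import json
import shutil
import tempfile
from pathlib import Path


def deliver_to_staging(component, build_dir, staging_root, source_label):
    """Copy a component's built artifacts into a shared staging area.

    Hypothetical layout (an assumption of this sketch):
    <staging_root>/<component>/ holds the artifacts plus a manifest.json
    naming the source label/tag the artifacts were built from.
    """
    target = Path(staging_root) / component
    incoming = target.with_name(component + ".incoming")

    # Copy into a scratch directory first, so a failed copy never
    # disturbs the delivery that is currently in place.
    if incoming.exists():
        shutil.rmtree(incoming)
    shutil.copytree(Path(build_dir), incoming)

    # List the delivered files before the manifest itself is written.
    artifacts = sorted(
        p.relative_to(incoming).as_posix()
        for p in incoming.rglob("*")
        if p.is_file()
    )
    manifest = {
        "component": component,
        "source_label": source_label,
        "artifacts": artifacts,
    }
    (incoming / "manifest.json").write_text(json.dumps(manifest, indent=2))

    # Replace the previous delivery with the new one.
    if target.exists():
        shutil.rmtree(target)
    incoming.rename(target)
    return manifest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build = Path(tmp) / "build"
        (build / "include").mkdir(parents=True)
        (build / "lib").mkdir()
        (build / "include" / "billing.h").write_text("/* header */\n")
        (build / "lib" / "libbilling.a").write_text("binary placeholder\n")
        result = deliver_to_staging(
            "billing", build, Path(tmp) / "staging", "BILLING_BUILD_42"
        )
        print(result["artifacts"])
```

The replacement step here is not atomic; a real staging area shared by several teams would likely rely on the version control tool or the file server to make a delivery visible all at once.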
Staging Area Implementations and Versions
The issue of versioning comes into play if it is necessary to know which version of a component is the "current" one in the staging area. If versioning is required, then it is typically handled one of two ways:
Staging Directory: a separate directory tree in the repository is used to house any "installed" staging artifacts (staged artifacts). Developers will typically use the staging area plus the top-level directory for their own components in their sandbox (and don't extract/checkout anything else unless and until they need to view the source for something outside their owned component).
Figure 2. Staging Directory Versioned in same Repository as Components
Staging Repository: a separate repository is used to house all staged artifacts. It can therefore accommodate separate sets of versions and tags/labels (which has good points and bad points).
Figure 3. Staging Repository and Separate Component Repositories
Sometimes the granularity of access control, administration, and mirroring/synchronizing will determine which of the above two approaches is best. If each component is large enough to already warrant its own separate repository from the others, then a separate staging repository is typically used. Sometimes, for local performance, a staging area might be mirrored or replicated to local sites/storage to cut down on the network bandwidth consumed during the build cycle.
Making it "Agile"
The Staging Area is a specific technique for separating a (sub)team's build-dependency interface from its implementation for the benefit of the rest of the team(s). A Staging Environment is the component-version "mediator" (coordinator, really) that houses the common interface and necessary artifacts to satisfy build/test dependencies across subteams. How might we apply an "agile" adaptation of it?
Well, the "simple" case is when no separate staging area is needed because the whole "one team" can peacefully co-exist in "one repository" and each work at their own "sustainable pace" without unduly impacting the others. So there is little need to think so much about subparts and subteams, and it is easier to focus on "the whole".
Other times, factors of scale rear their ugly head! These may be issues of system/build scale, organization and organizational process, ownership over computing resources, etc. Or maybe not all the subteams are using "agile", and some of them can't tolerate such a high frequency of changes/deliveries from the agile teams into their own part of the repository.
One of the key problems to solve is when and how often a subteam should do a "signed and sealed" delivery into the staging area. If every commit to the codeline is too frequent for a staging delivery, then an arrangement must be negotiated with the other subteams. This is where some agile methods try to "scale" by using a team of teams (e.g. a "Scrum of Scrums") to manage the staging frequency and coordination.
Scaling Continuous Integration up to Continuous Staging
If it is necessary to "scale up" my build process & resources to use a Staging Environment, how might I "scale up" a practice like "Continuous Integration" to approximate "Continuous Staging" into the staging area? This would avoid, or at least minimize, the need to tag/label every delivery into the staging area, and hence minimize the need to manage build-version dependencies between components and the subteams that work on them. Even if it were no longer practical to use a single repository or full system build for every commit across the whole team, it might be feasible to:
Even in those cases where you might still need to "version" the repository (and component) for what you delivered to the staging area, the staging area itself can be used to manage the current latest and greatest set of "system buildworthy" components and their versions (both source and binaries).
Component Versioning and Releasing
If the development of all your components results in one coordinated release of a single application or system, then it may be best to version the source files, not the compiled libraries. Even when "versions" are associated with what gets delivered to the staging area, they typically refer to versions of the source that produced the "staged" deliverables. If, however, your result is really an overall product family or product line of multiple components that feed into multiple products for multiple deliverable systems, then the component/library reuse and independent component release schedules may make it necessary to version the binary/library releases.
In the latter case of a product family, each component release is essentially a release of a third-party component to each of the other component teams. The vendor release in this case originates from elsewhere within the organization (rather than from an external supplier), but the underlying business model for reuse and release of components matches that of a third-party vendor/supplier (albeit an internal one). There are truly external vendor and third-party deliverables, and then there are items that may be internal to your organization but should be regarded as internally vendor/third-party supplied to your particular product and team. In those cases, versioning the delivered binaries is recommended.
Some shops use a separate Third Party Repository for such purposes. One reason is that the component's supplier and release schedule are independent of the rest of the application.
Another reason is that if most of the elements are binary in nature, it is often desirable to have a distinct storage area with more efficient storage parameters/capacity (and sometimes the repository can be configured so it is "tuned" for performance based on knowledge of the kinds of elements it will predominantly store).
And of course, if you get code delivered from any of those third parties, you probably want to version it along with the delivered binaries, unless the binaries can be reproduced from the code you were given (sometimes the source delivered is insufficient for that). If you have to modify any of that third-party code for your own custom, value-added purpose, you will probably want to use the Third Party Codeline pattern (plus, get the vendor to incorporate your changes, unless the organization deems them proprietary and is unwilling to submit them back to the vendor).
References
Acknowledgements
Brad Appleton is co-author of Software Configuration Management Patterns: Effective Teamwork, Practical Integration. He has been a software developer since 1987 and has extensive experience using, developing, and supporting SCM environments for teams of all shapes and sizes. In addition to SCM, Brad is well versed in agile development, and cofounded the Chicago Agile Development and Chicago Patterns Groups. He holds an M.S. in Software Engineering and a B.S. in Computer Science and Mathematics. You can reach Brad by email at brad@bradapp.net.
Steve Berczuk is an independent consultant who has been developing object-oriented software applications since 1989, often as part of geographically distributed teams. In addition to developing software, he helps teams use Software Configuration Management effectively in their development process. Steve is co-author of the book Software Configuration Management Patterns: Effective Teamwork, Practical Integration. He has an M.S. in Operations Research from Stanford University and an S.B. in Electrical Engineering from MIT. You can contact him at steve@berczuk.com. His web site is www.berczuk.com.
Steve Konieczka is President and Chief Operating Officer of SCM Labs, a leading Software Configuration Management solutions provider. An IT consultant for 14 years, Steve understands the challenges IT organizations face in change management. He has helped shape companies' methodologies for creating and implementing effective SCM solutions for local and national clients. Steve is a member of Young Entrepreneurs Organization and serves on the board of the Association for Configuration and Data Management (ACDM). He holds a Bachelor of Science in Computer Information Systems from Colorado State University. You can reach Steve at steve@scmlabs.com.
Last Updated on Thursday, 09 November 2006 10:44