Top 10 Best Practices in Configuration Management

In his CM: the Next Generation series, Joe Farah gives us a glimpse into the trends that CM experts will need to tackle and master based upon industry trends and future technology challenges.

Summary:
Joe Farah identifies the top ten "best" practices in configuration management and goes even further by listing ten more runner-up practices.

There are a lot of best practices in the CM trade. You might be hard-pressed to pick the best ones out. But that's exactly what we'll try to do in this month's article. You may see a few that you're quite familiar with. There may be a few that are hotly contested issues. I'll bet there are more than a few that you haven't fully considered before. Of the dozens of CM/ALM best practices we use ourselves, I've tried to pick the top ten. It was a difficult task, so I'll start by mentioning the 10 runners-up.

20. Organize Shared Code into Separate Products

When you're producing software that is shared by different products, issues such as development ownership, timing of releases, etc. are typically difficult to manage because the different products utilizing the shared code have different sets of requirements, and possibly differing processes. Often the shared code starts out as part of one product's development but business logic dictates that it be shared in multiple products.

The best practice in this situation is to partition the shared code into one (or more) separate products. That product has the other product teams as its customers. Control of releases, frequency, features, etc. really has to be managed the way a full product is managed. Different customers will want to move to new releases of the shared product at different times. Some will want the new features, some won't want the risk. The shared code may need to branch into different release streams to support the various customers.

19. Use Change Promotion instead of Promotion Branches

Branches are often used to track changes.  This strategy works, but it is generally used because of insufficient tool and process technology.  Promotion branches help ensure changes are promoted through the promotion levels dictated by the process.  However, they assume a pessimistic strategy which caters to the worst case rather than the most usual case.  As a result, there is a lot of merging, labelling, etc. that is unnecessary.  Roll-backs are also difficult.  As well, it's quite difficult to adjust the change process to support additional promotion levels, because of the cost of each promotion level in terms of merging/labelling, and because of the administrative issues involved with the insertion of new promotion branches among the existing ones. 

Ideally, you should use technology where promotion is simply a status change on the change object (assuming there is one). With the appropriate processes and technology, merge operations should only be required when changes don't flow through the change promotion process in the same sequence from level to level. So generally, configuration management is reduced to simply change management: promoting or rolling back the change status. The technology should allow you to view configurations at each promotion level based on the change status alone. So although this is a best practice, it is highly dependent upon the proper technology.
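To make the idea concrete, here is a minimal Python sketch of promotion as a status change. The promotion level names, the Change class and the configuration_at helper are all hypothetical, invented for illustration rather than taken from any particular CM tool.

```python
from dataclasses import dataclass

# Hypothetical promotion levels, ordered from lowest to highest.
LEVELS = ["open", "ready", "built", "tested", "released"]


@dataclass
class Change:
    change_id: str
    files: dict  # file name -> revision introduced by this change
    status: str = "open"

    def promote(self):
        """Move the change up one promotion level (a status change only)."""
        i = LEVELS.index(self.status)
        if i + 1 < len(LEVELS):
            self.status = LEVELS[i + 1]

    def roll_back(self):
        """Move the change down one promotion level."""
        i = LEVELS.index(self.status)
        if i > 0:
            self.status = LEVELS[i - 1]


def configuration_at(level, baseline, changes):
    """Return the file revisions visible at a promotion level.

    Starts from the baseline and applies, in order, every change whose
    status is at or above the requested level. No branches or labels.
    """
    threshold = LEVELS.index(level)
    config = dict(baseline)
    for change in changes:
        if LEVELS.index(change.status) >= threshold:
            config.update(change.files)
    return config


baseline = {"main.c": 1, "util.c": 1}
c1 = Change("c.1", {"main.c": 2}, status="tested")
c2 = Change("c.2", {"util.c": 2}, status="ready")

tested_view = configuration_at("tested", baseline, [c1, c2])
ready_view = configuration_at("ready", baseline, [c1, c2])
```

In this sketch the "tested" view picks up only c.1, while the "ready" view picks up both changes. Promoting c.2 is a one-field update, and rolling it back is just as cheap.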

18a. Separate Customer Requests from Engineering Problems/Features

Customers want to raise problems, ask for new features, or simply specify how they believe the tool should operate, without reference to whether it is a feature request or a problem report. Some customers will ask for the same features as other customers, perhaps with a different twist or priority. Customers want to know the status of their requests, many of which can be dealt with directly by the technical support team without involving the engineering team. Some are data configuration issues, others non-problems. The customer just wants the majority of requests addressed successfully.

The engineering team doesn't care about how many customers report a problem or request a feature - they just want their marching orders: a list of problems to be fixed and a list of features to be implemented. Sure, they're prioritized based on customer input, but ultimately, this reflects the value of the feature in the product. It's too hard for customers to classify issues as features or problems, and it doesn't really matter anyway - it's input to the product team. But it's critical to distinguish engineering problems from features as they follow completely different processes. All this to say, track customer requests separately from engineering problems/features. That is not to say that you should use a separate repository. And of course the engineering records should be traced back to the customer requests.

18b. Separate Problems/Issues/Defects from Activities/Features/Tasks

A related best practice that we touched on in the last paragraph is that problem fixing is a completely different process from feature development. A problem fix is a fix to the product because it didn't meet the requirements. There is no new requirement specification, and the specification for the change is the problem report itself. The first step is to ensure reproducibility. The problem has to be investigated for potentially all supported development streams. And so forth.

Feature development is completely different.  The feature has to be designed and fully specified.  New requirements are needed to put the case forward for the feature.  User documentation has to be updated.  A new set of test cases has to be established.  Problems and Features (or more generally activities) are different beasts.  Don't try to track them both as "tasks".  If you have separate ALM tools to deal with those processes, that's perhaps a different matter.  But in a fully integrated system, keep them separate.

17. Use Tags and Separate Variant Code into Separate Files

Avoid the temptation of using parallel branches to manage separate variants.  Consider that if you have 3 variant options for language, 3 more for security level and 2 more for packaging, you'll have 18 branches already.  Variant management must be addressed in the software engineering realm, not in the CM realm primarily.

The CM realm should allow you to tag files with the appropriate variant tags, and to select files based on one or more specified variant tags. In some cases, the CM tool might go so far as to allow you to tag specific changes as "variant" changes that can be "automatically" applied to your source code when that variant is selected.

But the bigger task of the CM manager is to make sure that the development organization understands how to deal with variants. The preference is run-time configuration. Next to that is link/load time configuration (by selecting the files to load or to link together). Then comes compile-time variants, also known as conditional compilation. This is less appealing because you then need different variants of the same object file, making your build task more complex (often the entire build is replicated for each variant to get around this problem). But at all costs avoid the parallel branch approach.
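The branch arithmetic and the tag-based alternative can be sketched in a few lines of Python. The variant values, file names and the select_files helper below are made up for illustration; a real CM tool would hold the tags in its repository.

```python
from itertools import product

# Hypothetical variant options: 3 languages, 3 security levels, 2 packagings.
LANGUAGES = ["en", "fr", "de"]
SECURITY = ["low", "medium", "high"]
PACKAGING = ["cd", "download"]

# One parallel branch per combination: 3 * 3 * 2.
branch_count = len(list(product(LANGUAGES, SECURITY, PACKAGING)))

# Alternative: one line of development, with variant code kept in separate
# files that carry variant tags. An empty tag set means "common to all".
FILE_TAGS = {
    "core.c": set(),
    "messages_en.c": {"en"},
    "messages_fr.c": {"fr"},
    "crypto_high.c": {"high"},
    "crypto_low.c": {"low"},
}


def select_files(file_tags, selected):
    """Return the files that are untagged or whose tags are all selected."""
    return sorted(name for name, tags in file_tags.items() if tags <= selected)


french_secure = select_files(FILE_TAGS, {"fr", "high", "download"})
```

Here branch_count comes to 18, while the tagged approach needs no variant branches at all: selecting {"fr", "high", "download"} simply picks core.c, crypto_high.c and messages_fr.c for linking or loading.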

16. A Problem Is Not a Problem Until It's In the CM Repository

This best practice will seem obvious to those who have always relied on good processes. A problem is not a problem unless it is defined in the CM repository. That means that you don't have source code changes to fix problems that have not been identified in your problem tracking system (which should be part of your CM/ALM system). When you fail to abide by this rule, you get source changes that have not been authorized. You get problems fixed that you don't know about and can't tell your customers about. Your quality metrics get messed up. And so forth.

If you've always relied on your developers to just identify and fix bugs without reporting them, stop it.  Force them to raise a problem report, but make it easy to link that problem report to their change so that they don't have duplicate data entry describing the change.  If you don't know how many problems you've fixed, you'll never have any idea as to how many are still left to fix.  Simply ignore problems (other than true emergencies, and other than getting the source to do the data entry) unless they are in the repository.

15. Use Live Data CRB/CIB Meetings

Use live data at your CRB/CIB meetings and update the information at the meeting. Today's meeting rooms make this easy to do. If your tools are not up to the task, it's time to review and possibly upgrade them. But the more modern change/problem/feature management tools will support interactive query and update in the context of such meetings. The result is more timely decision-making, quicker response times and less administration with fewer errors.

14. Warm-standby Disaster Recovery

Whether it's disk crashes, earthquakes, terrorism, floods, hurricanes, or whatever, you need to be able to continue development and provide ongoing support to your customers.  The way to support this is to ensure that your CM/ALM toolset has full redundancy at an off-site location that can be quickly switched to in case of a disaster.  Whether it serves strictly as a backup site or as a fully redundant node in a multiple site or remote site solution, you cannot afford to put together a solution for your team that will leave them high and dry in the case of a disaster. And it has to work for your entire ALM solution. This may not sound like a CM best practice at first, but it is and for some of you it may need to make the top ten list.

13. Continuous Automation

Ultimately, the quality and responsiveness of your CM/ALM solution will be governed by the level of automation. The more advanced the automation, the higher your quality and the lower your costs. And your processes will be affected significantly. When automation is in place, there's less need for auditing, there's more time for improving processes, there's more budget for improving your tools, and there's more focus on your core business. Once you've automated, step back and look at your gains, but also look at your remaining manual steps. Can they be further automated?

12. Change Control of Requirements

Change control is not a concept purely for source code. Requirements need change control as well. This means that you must be able to group changes to requirements into requirement change packages. You need ownership of your requirements and this needs to be tied into who is allowed to make changes to them. Version control is a necessity, as is the ability to create baselines. So are promotion levels for your requirements and the ability to view the various configurations. You likely have fewer people involved in changing requirements, but that doesn't mean you don't need change control.

11. Org Chart Integrated with CM Tool

Your organization chart is a key component of your CM solution. If you need to identify metrics for different parts of the organization, it sure comes in handy. If you need to automatically reallocate permissions because of vacations, again the org chart is helpful. If you want to do reporting by department, or if you want to direct problems to the correct area, the org chart is what makes it possible. If you want to identify what your staff has on their to-do lists, an org-chart based query should let you roll up a list against all the staff in a particular group.
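As an illustration of the roll-up query, here is a small Python sketch. The group names, staff names, record ids and helper functions are all hypothetical; the point is only that the org chart lives next to the CM data, so a group-level query can walk it.

```python
# Hypothetical org chart: group -> subgroups, and group -> direct members.
SUBGROUPS = {"engineering": ["platform", "apps"]}
MEMBERS = {
    "engineering": ["dana"],
    "platform": ["alice", "bob"],
    "apps": ["carol"],
}

# Hypothetical to-do lists keyed by staff member (record ids are made up).
TODO = {"alice": ["p.12", "a.40"], "carol": ["a.41"]}


def staff_in(group):
    """Return everyone in a group, including all nested subgroups."""
    staff = list(MEMBERS.get(group, []))
    for sub in SUBGROUPS.get(group, []):
        staff.extend(staff_in(sub))
    return staff


def rolled_up_todo(group):
    """Return each staff member's to-do list for the whole group."""
    return {person: TODO.get(person, []) for person in staff_in(group)}


engineering_todo = rolled_up_todo("engineering")
```

The same walk over the org chart could drive departmental reporting or the routing of problems to the correct area.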

The Top Ten

OK. So we've gone through the ones that did not make the top 10. Not because they're not important. I'm sure many of you see them as key practices, and so you should. But let's continue the countdown and identify the more critical, and more basic, best practices.

10. Tailor your User Interface Closely to your Process

You've got a CM tool, perhaps an in-house tool, maybe an open source tool, or even a commercial tool. If you can't customize the user interface to match your process you're in for a lot of headaches. A couple of "what's this box for" fields on a form will slow down your development team. Calling a file a module or a workspace a directory will add to confusion. You'll either have to put more money into training or put up with the data entry errors the foreign terminology leads to.

Your process is well understood by your team. Make sure you take the time to tailor the tool to match that process. And part of that includes hiding any functionality that is not specific to the user's role. Some tools have hundreds of buttons, menu items, etc. If all users have to see all of them, they'll never find what they need. So hopefully you can customize the tool so that only an hour or two of training is required for each role. If not, take the time to write a compact role guide which gives the specifics for how to use the tool for all of the actions in a given role. Include a single page quick reference sheet that can serve as a reminder until the team members are more familiar with the tool. And a separate advanced guide/sheet might help those who are more ambitious.

9. Automate Administration to Remove Human Error

Administration can be a costly aspect of your CM solution. I've heard horror stories, and I'll bet you have too. Excess administration is also a road block to process improvement, primarily because of the impact on all of the administration resources. But the worst part of admin-intensive solutions is that human errors will cause both decreased quality and lower productivity. If you can automate your administration, do so. And keep going until there's no more to automate. There are solutions out there which claim to require a fraction of a person for CM administration. And I believe these claims in some cases. What are they doing right that your solution can't do?

8. Enforce Change Traceability to Features/Problem Reports

If your developers are creating changes without authorization, your product quality will eventually suffer.  Every change needs an authorized directive, whether an approved/assigned problem report, an approved development activity/feature, or a work authorization.  Only then can you direct what's going into your product releases. 

But that's only half of the picture. If your team's changes are not linked back to the authorized record (i.e., a feature/activity or problem report), they may be marching to their orders, but you won't know what state the product is in. And if problems arise, you'll have a difficult time tracing them back to the source. The best way to force traceability between changes and features/problems is to force the user to select the problem/feature in order to create the change package (or check out a file). The user interface actions should trigger automatic traceability information between the selected problem/feature and the change package (or file revision in the worst case).
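A minimal Python sketch of this rule follows. The Request and ChangePackage records, the status names and the create_change function are assumptions made up for the example, not any tool's actual API. The key point is that there is no way to create a change without naming an authorized request, so the traceability link is recorded as a side effect.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    request_id: str
    kind: str    # "problem" or "feature"
    status: str  # hypothetical states, e.g. "open", "approved", "assigned"


@dataclass
class ChangePackage:
    change_id: str
    owner: str
    traces_to: str  # id of the authorizing problem report or feature
    files: list = field(default_factory=list)


# Hypothetical set of states that count as an authorized directive.
AUTHORIZED = {"approved", "assigned"}


def create_change(change_id, owner, request):
    """Create a change package only against an authorized request."""
    if request is None:
        raise ValueError("a change must reference a problem report or feature")
    if request.status not in AUTHORIZED:
        raise PermissionError(
            f"{request.request_id} is '{request.status}', not authorized for work"
        )
    # The traceability link is filled in automatically from the selection.
    return ChangePackage(change_id, owner, traces_to=request.request_id)


problem = Request("p.101", "problem", "approved")
change = create_change("c.1", "bob", problem)
```

Because the developer picks the request from a list rather than typing a reference, there is no duplicate data entry and nothing to reconcile at code review time.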

If your tool can't enforce this traceability, your team will have to spend time defining the traceability, either by entering information into different areas of the solution at checkout time, or by audit (and linking action) at the time of code review.  Your code review should be a time to do part of your configuration audit - that is ensuring that the code changes match the request.  But if the request is not traced to the code changes, this will be difficult at best, and your process will not be bullet proof.

7. Main branch per release vs Main Trunk

Do you want to start a good debate in your organization?  Bring up this topic.  Is it better to have a single Main trunk to which all changes are eventually merged or to have one main branch per supported release? 

Well to start with, all changes are never merged to a single Main trunk. With a Main trunk, the team has to decide which release the trunk reflects. Changes not made to that release have to be made somewhere else. If they're for future releases, they'll eventually get merged to the Main. Otherwise, not. Decisions have to be made as to when the Main switches from one release to another. Directions need to be given for how to deal with changes which are for other releases not on the main trunk, hopefully without delaying a developer's check-in operation (which would in turn either delay other development or incur unnecessary parallel development and the resulting merging and re-testing).

A Main-branch-per-release strategy is simple. You make the change in the branch that the request has targeted. The way you do that is always the same, regardless of release or timing. Some shops (and tools) support making changes directly to the trunk, without any branching - these also support the concept of a change package. In more advanced tools, changes to older releases can be automatically inherited (optionally) into newer streams (e.g. if the code has not changed between the older and new stream). The process is simpler and supports stream-based branching strategies (more on that later).

6. Dumb Numbering

Source files have names individually chosen by each developer.  They don't even have to be unique (although you need a unique identifier for every source object).   Not so with problem numbers, requirements, activity codes, test cases, etc.   Naming these would be considered an onerous task.  In a relational world, this is not so much a problem.  But in an engineering database, everything needs a unique id so that it can be referenced (for traceability purposes) and identified for configuration identification purposes. 

Too many want to make the "unique identifier" a way to know the content: system+priority+3 digit number.  The problem happens when the priority changes or the system is split into two. 

All of your unique identifiers suddenly change!  Unique identifiers must be numbered using a dumb numbering system.  For the most part, this means a fixed prefix plus a number (e.g. a.1232 for an activity code).  Some might want to tack on a product prefix, only to find that the same requirement or test case is going to be used in the next product, invalidating the id. Unique ids need to be kept dumb. 
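As a sketch of what dumb numbering looks like in practice, the following Python fragment allocates ids from a fixed prefix plus a counter and keeps everything meaningful in the record's fields. The IdAllocator and Activity names, and the field values, are hypothetical.

```python
from dataclasses import dataclass
from itertools import count


class IdAllocator:
    """Hands out ids of the form '<prefix>.<number>' and nothing more."""

    def __init__(self, prefix, start=1):
        self._prefix = prefix
        self._counter = count(start)

    def next_id(self):
        return f"{self._prefix}.{next(self._counter)}"


@dataclass
class Activity:
    activity_id: str  # dumb: never encodes system, priority or product
    system: str       # meaningful data lives in fields, which may change
    priority: int


activities = IdAllocator("a", start=1232)
act = Activity(activities.next_id(), system="billing", priority=2)

# Re-prioritizing or moving the activity to another system leaves the id,
# and every traceability reference to it, untouched.
act.priority = 1
act.system = "invoicing"
```

Anyone who needs the system or priority looks it up on the record by id, rather than reading it out of the identifier.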

There is room for some prefacing when the preface cannot change.  This occurs when a "parent" record owns the record being identified.  For example, a change record might be numbered sequentially as a sub-record of the developer making the change.  History cannot be changed so this would be an appropriate prefix.  Perhaps documents could be sequentially numbered within their document category (tech note:  TN.33, project status PS.412, etc.) - in this case the document type record is the "owning parent". 

But apart from such ownership sub-record prefaces, putting data into a unique id is just asking for trouble. In the old days this was a common practice because not everyone had access to the data records which described the object. That's different today, or at least if it's not, you have other problems that must be remedied first.

The Top Five

We're down to the final five.  These are not just important, but critical for moving your process forward.  Perhaps you're familiar with all of these already.  If you're not, get familiar with them so that you understand the impact clearly.

5. Continuous Integration with Automated Nightly Builds from the CM Repository

Your development team is continuously submitting changes to your repository. You can put code reviews in place (highly recommended - in fact this would have made the top five if the topic were a bit broader), have "unit testing" (i.e. change testing) rules, and perhaps a number of other software engineering guidelines. But if you don't build the system regularly and test it, you're risking some major delays as unforeseen incompatibilities bring new "builds" to a halt. If 1,000 software updates have been submitted, it may take a while to narrow down the issues to their causes, or even to isolate the issues themselves, as some may have overlapping symptoms.

Continuous integration is a process of performing builds very regularly, at least nightly, if not more frequently, and doing basic sanity testing on the build, and a good dose of automated testing too, if you're equipped to do so. When problems arise, they can be readily narrowed down to somewhere between a few and a few dozen software updates. At this frequency of builds, problems generally arise in ones and twos, rather than by the bushel.
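The narrowing-down step can be sketched in Python as follows. The Build record, its fields and the suspects function are hypothetical; the sketch assumes each build records how many of the submitted changes, in submission order, it included.

```python
from dataclasses import dataclass


@dataclass
class Build:
    build_id: str
    included: int  # number of submitted changes (in order) in this build
    passed: bool


def suspects(submissions, builds):
    """Return the changes that entered since the last passing build.

    Walks the builds in order and, at the first failing build, returns
    the changes submitted after the most recent passing build.
    """
    last_good = 0
    for build in builds:
        if build.passed:
            last_good = build.included
        else:
            return submissions[last_good:build.included]
    return []


submissions = [f"c.{n}" for n in range(1, 11)]
nightly = [
    Build("b1", 4, True),
    Build("b2", 7, True),
    Build("b3", 10, False),
]
to_investigate = suspects(submissions, nightly)
```

With nightly builds, the failure in b3 points at three changes (c.8, c.9 and c.10). Had the only build been b3, all ten changes would be suspects.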

A second advantage is that the resulting build environment can be used by your development team.  This can be from a simple confidence that they can use the latest built source code for their own testing, to generating shared compile/build environments for use in incremental builds or multiple subsystem testing.  The more timely feedback on the cause of the problem is also helpful as the developer still has the changes clearly in his/her mind and has not fully moved on to something else. 

If you want to build an Agile environment, continuous integration is a necessity. And the CM tools should support it.  Settle for nothing less than fully automated builds without CM manager/Build manager intervention unless problems occur.  If you can't do that, your processes are incomplete or your CM tools and procedures are too complex.

4. Data record Owner and Assignee

Ever hear the expression garbage-in garbage-out? Why does this happen? Developers take pride in their work. If they own a file and are responsible for all changes to it, they will treat it with a certain amount of respect and discipline. If anyone can change any piece of code at any time, this care is lost. "So what" if Bob has taken care to keep his code concise and well factored. I need to make a quick change to fix a problem and I'm going to use a copy and paste method because it's the fastest (really?!). Bob, on the other hand, would have added a parameter to the existing routine(s) so that the same code could be used for the purpose, without replicating functionality, bugs, and future code divergence.

Source code ownership is important, at the file level and at the branch level. But data ownership goes beyond source code. Every data record needs someone who is responsible for it - for its accuracy, for ensuring it runs through its process, for assigning tasks related to the record and for tracking such assignments.

The best practice of data ownership is meant to help ensure data quality, and this helps with process quality. My recommendation - an "owner" who is responsible for the data and associated process/tasks involved, and an assignee who has been assigned to the next step in moving the data through its process. For example, a development manager might have responsibility for a problem report, ensuring it gets successfully resolved. But (s)he may assign it to a developer, then to a tester along the path to closing out the problem report. Every data record needs ownership in the CM world so that there is both accountability and clear direction.

3. Status Flow for all Records with Clear In Box Assignments

A close relative of data ownership, a clear state flow is required for each type of object. Your tools should clearly show you the state flow for each object, and a status should be tracked against every object. Each state transition should clearly identify the applicable roles/permissions for the transition. It should be possible to specify additional rules and triggers pertinent to the transition. When you put all of this together, your data along with the processes should be telling the team what needs to be done.
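One way to picture such a state flow is as a table of transitions, each naming the roles allowed to perform it. The Python sketch below uses a made-up problem report flow; the state names, role names and the transition function are assumptions for illustration only.

```python
# Hypothetical state flow for a problem report:
# (from_status, to_status) -> roles permitted to make the transition.
TRANSITIONS = {
    ("open", "assigned"): {"manager"},
    ("assigned", "fixed"): {"developer"},
    ("fixed", "verified"): {"tester"},
    ("verified", "closed"): {"manager"},
}


class TransitionError(Exception):
    """Raised when a transition is not in the flow or not permitted."""


def transition(record, to_status, role, triggers=()):
    """Move a record to a new status if the flow and the role allow it."""
    key = (record["status"], to_status)
    allowed = TRANSITIONS.get(key)
    if allowed is None:
        raise TransitionError(f"no transition {key[0]} -> {key[1]}")
    if role not in allowed:
        raise TransitionError(f"role '{role}' may not do {key[0]} -> {key[1]}")
    record["status"] = to_status
    for trigger in triggers:  # optional extra actions tied to the transition
        trigger(record)
    return record


notified = []
report = {"id": "p.101", "status": "open"}
transition(report, "assigned", "manager",
           triggers=[lambda r: notified.append(r["id"])])
```

Because the flow is data rather than tribal knowledge, the tool can display it, enforce it, and use each status change to decide whose in-box the record lands in next.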

Based on your role, you should have a number of in-boxes: documents to be reviewed, problems assigned to you to be fixed, changes that you need to review, other tasks assigned to you. These should appear in your in-box as a result of status changes and ownership assignment. In some cases assignment is by name. In other cases it's more generic, such as by group function. In the latter case, the in-box is potentially shared by others. Some people are better communicators than others. If you depend on your staff to keep assignments clear, you'll have hits and misses. If your process and data are doing the job for you, in concert with a decent CM/ALM tool, you'll march along more efficiently, and you'll be able to easily identify your bottlenecks and problem areas.
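An in-box in this sense is just a query over record type, status and assignee. The Python sketch below shows the idea; the Record fields, the in-box definitions and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    kind: str      # "problem", "change", "document", ...
    status: str
    owner: str     # responsible for the record overall
    assignee: str  # a user name, or a group name for shared in-boxes


# Hypothetical in-box definitions: name -> (record kind, status).
INBOXES = {
    "problems to fix": ("problem", "assigned"),
    "changes to review": ("change", "ready"),
    "documents to review": ("document", "review"),
}


def inbox(records, box, user, groups=()):
    """Return ids of records in the given in-box for a user.

    A record appears if it is assigned to the user by name or to one of
    the user's groups, and its kind and status match the in-box.
    """
    kind, status = INBOXES[box]
    names = {user, *groups}
    return [
        r.record_id
        for r in records
        if r.kind == kind and r.status == status and r.assignee in names
    ]


records = [
    Record("p.101", "problem", "assigned", owner="dana", assignee="bob"),
    Record("p.102", "problem", "open", owner="dana", assignee="bob"),
    Record("c.7", "change", "ready", owner="alice", assignee="reviewers"),
]
bobs_problems = inbox(records, "problems to fix", "bob")
bobs_reviews = inbox(records, "changes to review", "bob", groups=["reviewers"])
```

Nothing here depends on anyone remembering to tell Bob what to do: p.101 shows up because of its status and assignee, and c.7 shows up because Bob belongs to the group that shares that in-box.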

2. Stream-based Branching Strategy - do not overload branching

Branching provides an effective means for supporting parallel releases.  However, most shops use it for various other purposes:  change promotion, change packaging, build tracking, variant management, short term parallel changes, feature tracking.  I'm sure you can add to the list.  The result is that what should be a straightforward two dimensional "tree" structure of revisions (i.e. release stream and revision within a release), turns into a spaghetti maze of branches.  Elaborate branching strategies are designed and reviewed intensively to ensure that the team can clearly follow it.  Branching, labelling and merging skills are honed to ensure that these can be done effectively. 

But try to ask your CM tool to understand what all the branches mean - well that's more of a problem. The result is that an inordinate amount of effort goes into unravelling these mazes. And what's worse is that the CM tool can't give the developers guidance. Instead, the onus is on the developers to learn and follow branching strategies. So CM becomes tedious rather than automated. A significant amount of effort is directed to configuration management, rather than to change management.

Stream-based branching is the only strategy I've seen that eliminates almost all manual CM.  It's the only one I've seen that eliminates the need to label (manually).  It's the only one I've seen that allows developers to start using the system without training on branching strategies.  It's the only one I've seen that truly minimizes branching.  It's the only one that can automatically tell developers that they need to branch. 

It scales from very small to very large projects extremely well. It does require tools which deal with change packaging, change promotion, build tracking, etc. without using branches. I'm sure having this as the #2 best practice will raise a few eyebrows, especially among those who have never tried it, but it so dramatically simplifies CM without compromise, that it is a necessary path for any next generation CM.

The Best of the Best Practices

1. Use of Change Packages

This one is obvious.  Everyone will agree on it, I hope.  It would really be my hope that this is taken as much for granted as is storing file revisions - it shouldn't have to be a best practice - it should be a given. 

But although there are several tools which support change packages, and others which at least go half way and tie file revisions to a common task, sadly, many, if not most, projects today do not use change packages. They check in files one at a time. They create delta reports by appending the delta reports of several different files. They promote each file independently, hopefully with some manner of determining which files all have to be promoted together. They merge changes by merging each file separately, relying again on some virtual container (usually in the developer's head) to identify which files are involved.

Unfortunately, they are supported by industry pseudo-APIs such as the SCC API for IDE plug-ins. This is a file-based API, although at least one tool has managed to work in a change-based operation around the API.

If your developers complain that CM is too much of an overhead on them, you've got to give them an incentive by making their life easier. Change packages, when properly supported in an integrated tool (with integrated repository and user interface), will actually decrease the developer's workload. Not just in the long run, but on everyday operations such as delta reporting. The sales job (i.e. selling CM to developers) becomes much easier.

So what is a change package? It packages all changes, both files and directory structure, together along with the reasons for the change (in the form of traceability references), and the integration data, such as product, target release/stream, change status, unit test comments, etc., into a single record. Build managers only have to consider 20 changes instead of 75 files for their incremental build. Team members can stop looking at revision codes and start looking at changes. Configuration lineup/baseline algorithms can ensure that all files for a change are included as part of the configuration. Some tools will even analyze a workspace and build (and possibly submit) a change package based on the differences with respect to the context view (e.g. CM+).
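To give a feel for the record described above, here is a minimal Python sketch of a change package and a per-change delta report. The field names, sample values and the delta_report helper are hypothetical and do not reflect how any particular tool stores its data.

```python
from dataclasses import dataclass


@dataclass
class ChangePackage:
    change_id: str
    product: str
    stream: str           # target release/stream
    status: str
    traces_to: list       # problem reports / features behind the change
    file_revisions: dict  # file path -> new revision, all files together
    unit_test_notes: str = ""


def delta_report(change, baseline):
    """Return one report for the whole change: (path, old rev, new rev).

    The old revision is None for files the change adds.
    """
    return [
        (path, baseline.get(path), new_rev)
        for path, new_rev in sorted(change.file_revisions.items())
    ]


baseline = {"src/a.c": 3, "src/b.c": 5}
change = ChangePackage(
    change_id="c.42",
    product="widget",
    stream="rel2",
    status="ready",
    traces_to=["p.101"],
    file_revisions={"src/a.c": 4, "src/new.c": 1},
    unit_test_notes="ran regression suite",
)
report = delta_report(change, baseline)
```

The developer asks for the delta of c.42 and gets every affected file in one report, instead of stitching together per-file reports and remembering which files belonged together.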

Change packages make developers' work easier. They also make it easier for developers to trace back through history looking for specific changes. Change packages have been around in some tools for a couple of decades. But in other tools, the "state-of-the-art" is the introduction of new methods and tools which cope with the fact that they don't support change packages. If you're not using change packages in your CM environment, your CM will be viewed as overhead or heavy-handed process. Your productivity and quality will both be limited.

Perhaps you've had a bad experience using change packages.  If so, the problem was not change packages, it was the technology that was trying to deliver this functionality to you.  Find a vendor that will show you how easy it should be.  Then, use change packages!

Best Practices Summary

So there you have it - the top 10, in fact, the top 20 (or 21).

1. Use of Change Packages

2. Stream-based Branching Strategy - do not overload branching

3. Status flow for all records with Clear In Box Assignments

4. Data record Owner and Assignee

5. Continuous integration with automated nightly builds from the CM repository

6. Dumb numbering

7. Main branch per release vs Main Trunk

8. Enforce change traceability to Features/Problem Reports

9. Automate administration to remove human error

10. Tailor your user interface closely to your process

11. Org chart integrated with CM tool

12. Change control of requirements

13. Continuous Automation

14. Warm-standby disaster recovery

15. Use Live data CRB/CIB meetings

16. A Problem is not a problem until it's in the CM repository

17. Use tags and separate variant code into separate files

18a. Separate customer requests from Engineering problems/features

18b. Separate Problems/Issues/Defects from Activities/Features/Tasks

19. Change promotion vs Promotion Branches

20. Separate products for shared code

Perhaps these best practice descriptions are more detailed than you expected. Perhaps they don't quite fit into your picture, or you find them too opinionated. Maybe you expected one on how best to do labelling or on how to derive a branching strategy - these would, of course, be incompatible with the best practices that I've mentioned here. Perhaps you have some obvious ones that are missing here - give me your feedback. I'd love to hear your input. There are many more on the fringes that fall into software engineering, or tool architecture, rather than CM process. What about the order of importance in which I've presented them? That's gotta ruffle some feathers. If not, I'd love to hear that too.

CMCrossroads is a TechWell community.
