The Lost Art of Change Control

Summary:
Change control exists to review and approve important modifications, but when it is done wrong, you risk confusion, chaos, failures, and outages. Poorly run change control wastes everyone’s time, but far worse is the missed opportunity to assess and manage risk. Here, Bob Aiello gets you up to speed on the lost art of change control.

Change control exists in most organizations to review and approve important modifications, including infrastructure upgrades and application migrations. However, this responsibility can turn into two-hour meetings to discuss anything from changing the toner cartridge on a production printer to large enterprise systems deployments.

I often felt that being part of the change control board actually required that I complain about the long meetings and what seemed like endless bureaucratic red tape. Many of my colleagues have indicated that they view change control as a complete waste of time, and I have seen much dysfunctional behavior in the change control process, which can have disastrous consequences.

Poorly run change control wastes everyone’s time, but far worse is the missed opportunity to assess and manage risk. In this article we will get you up to speed on the lost art of change control.

Application and infrastructure changes are risky. There are actually seven different types of change control [1], but all too often, change control analysts are narrowly focused on scheduling changes on a calendar. Conflicting changes must not be scheduled on the same day. For example, you cannot upgrade an application on the same day that you are expanding its disk drives, because the storage would be unavailable to the team trying to perform the upgrade. But the calendar is only the first step.
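The calendar check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `ChangeRequest` record, its field names, and the sample change IDs are hypothetical and do not come from any particular change management tool. The idea is simply to group changes by day and flag any pair that touches the same resource.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class ChangeRequest:
    """Hypothetical change record; the field names are illustrative only."""
    change_id: str
    day: date
    resources: frozenset  # systems the change touches or depends on


def find_conflicts(changes):
    """Return (id, id, shared resources) for same-day changes that overlap."""
    by_day = defaultdict(list)
    for change in changes:
        by_day[change.day].append(change)

    conflicts = []
    for same_day in by_day.values():
        # Compare every pair of changes scheduled on the same day.
        for first, second in combinations(same_day, 2):
            shared = first.resources & second.resources
            if shared:
                conflicts.append((first.change_id, second.change_id, sorted(shared)))
    return conflicts


changes = [
    ChangeRequest("CHG-1", date(2024, 3, 2), frozenset({"app-server", "storage-array"})),
    ChangeRequest("CHG-2", date(2024, 3, 2), frozenset({"storage-array"})),
    ChangeRequest("CHG-3", date(2024, 3, 9), frozenset({"storage-array"})),
]
print(find_conflicts(changes))
```

Here the application upgrade (CHG-1) and the disk expansion (CHG-2) are flagged because both need the storage array on the same day, while CHG-3 a week later is not. A check like this only covers the calendar; it says nothing about technical risk.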

Gatekeeping change control refers to reviewing proposed changes to a production or controlled user acceptance test environment. These changes need to be reviewed for technical risk, but many organizations don’t do a sufficient job of assessing and managing that risk. This failure is why many system upgrades lead to outages and widespread service interruptions. Where did change control go wrong?

One common problem is that many organizations fail to differentiate between the change control board (CCB) and the change advisory board (CAB). The CCB is responsible for managing the process, ensuring traceability, and enforcing tollgates such as required approvals. The process is very important, but so is having smart technical experts review the proposed changes to identify and mitigate technical risk. We have seen organizations rename their CCB to CAB, but then they don’t have an advisory board of experts who truly understand the downstream impact of a change.

Many systems today have interdependencies, and if you are going to change an interface, you need to include the owner of each asset that might be impacted. When technical risk is not reviewed and addressed up front, the only plan left is to manage issues if and when they arise, which is not acceptable. Further, the incident and problem management teams are often poorly integrated with the change control function, which means that the critical incident response team may not even be aware that a change was recently made. Even if they are aware, they may lack the technical details and expertise to fully understand the nature of the change that led to an outage.
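One way to make sure the right asset owners are in the room is to walk a dependency map outward from the asset being changed. The sketch below assumes such a map exists; the `DEPENDENTS` and `OWNERS` tables, the asset names, and the team names are all invented for illustration, and in practice this data would come from whatever configuration management database the organization maintains.

```python
from collections import deque

# Hypothetical inventory: each asset maps to the assets that consume it.
DEPENDENTS = {
    "billing-api": ["invoice-service", "customer-portal"],
    "invoice-service": ["reporting"],
    "customer-portal": [],
    "reporting": [],
}

# Hypothetical ownership records.
OWNERS = {
    "billing-api": "payments-team",
    "invoice-service": "finance-it",
    "customer-portal": "web-team",
    "reporting": "finance-it",
}


def impacted_assets(changed_asset, dependents):
    """Breadth-first walk over the changed asset and everything downstream of it."""
    seen = {changed_asset}
    queue = deque([changed_asset])
    while queue:
        current = queue.popleft()
        for consumer in dependents.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen


def reviewers_for(changed_asset, dependents, owners):
    """Owners who should be invited to review a change to the given asset."""
    assets = impacted_assets(changed_asset, dependents)
    return sorted({owners[asset] for asset in assets if asset in owners})


print(reviewers_for("billing-api", DEPENDENTS, OWNERS))
```

A change to the billing interface pulls in three teams here, including the owner of a reporting system that is only indirectly affected. The same list could also be handed to the incident response team so they know whom to call if the change is followed by an outage.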

Although the experts are on the payroll, they may not be involved in an advisory role to help identify and address challenges that occur. In missile and defense organizations, there is usually an additional change control step to review proposed changes before they are approved and funded. Some businesses would do well to adopt a review step, often called a priori change control, even if they are not working with safety-critical systems.

The much bigger issue is that we see change control functions lump all their changes into the same meeting, resulting in too much noise and basically defeating the whole purpose of effective change control. The first thing you want to do is get all the routine changes out of the change control meeting. Get them reviewed separately and, where appropriate, make them preapproved (which the ITIL v3 framework calls standard changes).

It is also a good idea to organize changes by their technical characteristics. Changes to a firewall or other security-related component should be handled in a specialized change control meeting, with a summary given back to the main CCB. Similarly, configuration changes can sometimes be handled separately. When you dump hundreds of unrelated changes into the same two-hour meeting, your change control process is effectively rendered useless.
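The triage described in the last two paragraphs amounts to a simple routing rule. The following is a minimal sketch, assuming each change is tagged with a type and a category; the catalog of preapproved types, the category names, and the track names are hypothetical placeholders rather than ITIL terminology.

```python
from collections import defaultdict

# Hypothetical catalog of routine change types that were reviewed once and preapproved.
PREAPPROVED_TYPES = {"toner-replacement", "log-rotation"}

# Hypothetical mapping from change category to a specialized review meeting.
SPECIALIZED_REVIEWS = {
    "security": "security-review",
    "configuration": "config-review",
}


def route_change(change):
    """Pick a review track for one change (a dict with 'type' and 'category')."""
    if change["type"] in PREAPPROVED_TYPES:
        return "standard-preapproved"
    # Anything without a specialized review falls through to the main board.
    return SPECIALIZED_REVIEWS.get(change["category"], "main-ccb")


def build_agendas(changes):
    """Group change IDs by the track that should review them."""
    agendas = defaultdict(list)
    for change in changes:
        agendas[route_change(change)].append(change["id"])
    return dict(agendas)


changes = [
    {"id": "CHG-10", "type": "toner-replacement", "category": "facilities"},
    {"id": "CHG-11", "type": "firewall-rule", "category": "security"},
    {"id": "CHG-12", "type": "erp-deployment", "category": "application"},
    {"id": "CHG-13", "type": "timeout-setting", "category": "configuration"},
]
print(build_agendas(changes))
```

Only the enterprise deployment lands on the main CCB agenda in this example; the toner cartridge never reaches a meeting at all, and the firewall and configuration changes go to their own reviews.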

You might decide to hold separate meetings, or just organize the meeting agenda by the type of change. Set entry criteria to have a clear description of all changes delivered to attendees for their review before the meeting starts. I was once able to take a dysfunctional weekly two-hour meeting down to two short half-hour meetings that were quick and to the point by making everyone circulate a clear description of the proposed changes to all stakeholders beforehand.
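Entry criteria are easier to enforce when they are checked before the agenda is published. The sketch below is one possible check, not a prescribed one: the 20-word minimum, the 24-hour lead time, and the `description` and `circulated_at` field names are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Assumed thresholds; each organization would set its own.
MIN_DESCRIPTION_WORDS = 20
REVIEW_LEAD_TIME = timedelta(hours=24)


def entry_criteria_problems(change, meeting_start):
    """Return the reasons a change is not ready for the meeting (empty if ready)."""
    problems = []

    description = change.get("description", "")
    if len(description.split()) < MIN_DESCRIPTION_WORDS:
        problems.append("description is missing or too short")

    circulated_at = change.get("circulated_at")
    if circulated_at is None:
        problems.append("description was never circulated")
    elif circulated_at > meeting_start - REVIEW_LEAD_TIME:
        problems.append("description was circulated too late for review")

    return problems


meeting = datetime(2024, 3, 5, 10, 0)
late_change = {
    "description": "Upgrade the database.",
    "circulated_at": datetime(2024, 3, 5, 9, 0),
}
print(entry_criteria_problems(late_change, meeting))
```

A change that fails the check is deferred to the next meeting rather than discussed cold, which is what keeps the meeting itself short.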

Your IT processes, including change control, should be managed by the software engineering process group, which is effectively a change control function for the entire software development process.

Change control meetings should not be a long, boring waste of time. Effective change control can significantly help improve your entire software development process. Take out the noise and add the right amount of structure, and your change control function will help you avoid costly mistakes, improve quality, and enhance productivity for your entire team.

 

[1] Aiello, Bob and Leslie Sachs. 2010. Configuration Management Best Practices: Practical Methods that Work in the Real World. Addison-Wesley Professional.

CMCrossroads is a TechWell community.