Seven Lessons You Learn When Growing Your Configuration Management

Summary:
When the number of employees, products, and releases you’re managing grows rapidly, that transformation introduces several challenges—and opportunities—in almost every aspect of configuration management. This article presents the major issues a company may face and the improvements you can make to processes and tools as a result.

In 2004, I was appointed the configuration management leader in the company I still work for today. This was an exciting assignment for me, and in this article, I will share some of my experiences coming up to speed as a CM guru.

We were a relatively small company—about forty research and development members, with two products and long release cycles—and our CM was pretty basic: a popular source-control tool, an old and rarely used bug-tracking tool, and some semi-cryptic shell-based build scripts. A team of two CM engineers was more than enough to handle everything.

A year went by. The company was doing well and recruited more employees slowly but steadily. At first, my team just needed to make sure we had enough software licenses for the new recruits, and luckily, management didn’t need to be convinced that the CM budget had to grow along with the headcount. I can’t stress this first lesson enough:

1. Managers need to be aware of the CM-related costs of recruiting more developers.

The first challenge we faced as a CM team was the bug-tracking tool. It was no longer supported, and we couldn’t purchase more licenses for it. Because it was also cumbersome to use and almost impossible to customize, we had no choice but to look for a replacement quickly. We went for the “natural” choice of an advanced defect-tracking tool offered by our source-control tool vendor. It was a costly tool, and it required updated hardware, which increased the cost even further. But we had the budget, and this was a real game-changer for us. Suddenly it was easy to report and track bugs, run various reports, and customize both the data and the behavior.

Here we learned another lesson:

2. Tool customization must also be controlled.

Many users came to us with requests and suggestions, some of them conflicting and some with undesired side effects. For example, one team leader asked to make a certain field mandatory, which annoyed the other teams who didn’t need that field at all. We eventually established a “triage” group that must authorize any customization request to the CM tools before my team implements it.

A couple of years later we were purchased by a large corporation, and then things started to get really interesting. We began developing new products, as well as adding large-scale features to the existing ones. More R&D staff was recruited, and at an increasing rate—in some cases, the number of employees grew by 10 percent in a single month.

As a small CM team, we had a lot to deal with as a result of this rapid growth. The three major issues we needed to constantly review (and revise as needed) were tool scalability, processes, and costs.

Scalability of most tools is pretty straightforward: Having more users means more data is generated, and more frequently. Hence, you need to make sure you can:

3. Support repository growth, handle multiple simultaneous connections, and provide an adequate level of performance.

It comes down to getting better hardware and making sure the tool is up for the task.

In our case, the source-control and bug-tracking tools were designed for the enterprise, so we concentrated on upgrading the servers’ hardware. Instead of upgrading the hardware incrementally each time we hit a limit, we decided on a one-time purchase of a couple of strong servers that were expected to last us at least five years.

Looking back, I think it was the right approach; we no longer needed to regularly clean up the storage or investigate performance issues, which left us time for other tasks. As a side note, even if a certain product has inherent performance problems, in most cases strong enough hardware can mitigate them (and, unfortunately, some vendors exploit that). The bottom line is, if you have the budget, get the strongest servers and fastest storage you can afford, even if it seems like overkill. It will pay off in the long run.

Scalability of builds, however, was another matter. Having more products means more build configurations to support and more builds to run (and keep). Also, having more code usually means builds take longer. On the other hand, users ask for more frequent builds in order to get faster feedback on their changes. Hence, the challenge is threefold: hardware (more build servers), software (a robust, user-friendly, and automated build system), and build performance.

Hardware-wise, we figured out we would need about ten new build servers, and this is where virtualization can help. Instead of purchasing and maintaining ten physical servers, we bought a single, extremely strong server with a lot of fast storage and deployed ten identical virtual machines to function as build servers. Besides saving on hardware in the long run, this also provides us with an easy way to manage the build environments: by using “snapshots” to regularly capture the servers’ images. So, for example, if a certain software update—or even an uninformed user—causes a problem in the build environment, we can quickly remedy it by reverting to the most recent snapshot. This gives us our next lesson:

4. It’s very important to keep the build environment controlled, especially when you have a lot of build servers.
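To make the snapshot approach concrete, here is a minimal sketch of how capturing and reverting build-server images could be scripted, assuming a libvirt-managed host; the VM names and snapshot labels are purely illustrative, and our own setup may well differ:

```python
# Minimal sketch, assuming a libvirt-managed virtualization host.
# VM names and snapshot labels are illustrative only.
import subprocess
from datetime import date

BUILD_VMS = [f"build-{i:02d}" for i in range(1, 11)]  # ten identical build VMs


def capture_snapshots(label: str = "") -> None:
    """Take a named snapshot of every build VM, e.g., after a verified update."""
    label = label or f"baseline-{date.today().isoformat()}"
    for vm in BUILD_VMS:
        subprocess.run(["virsh", "snapshot-create-as", vm, label], check=True)


def revert_vm(vm: str, label: str) -> None:
    """Roll a single build VM back to a known-good snapshot."""
    subprocess.run(["virsh", "snapshot-revert", vm, label], check=True)


if __name__ == "__main__":
    capture_snapshots()  # regular capture of the build images
    # revert_vm("build-03", "baseline-2024-01-15")  # recover a broken environment
```

A sensible policy, for instance, is to take a fresh snapshot only after an update has been verified on one machine, so that a revert always lands on a known-good image.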

Stronger hardware alone isn’t enough to drastically improve build times, but with multiple strong servers we could perform parallel and distributed builds, either using compiler capabilities or with third-party tools. This is a continuous effort; even today we are still looking for ways to improve build times, such as by reducing code dependencies to increase parallelism.
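As a rough illustration of the kind of parallelism this enables, the sketch below builds mutually independent components concurrently; the component names, the make invocation, and the job count are placeholders for whatever your build system actually uses:

```python
# Sketch: build mutually independent components in parallel.
# Component names and the make invocation are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

COMPONENTS = ["libcore", "drivers", "ui", "tools"]  # hypothetical independent components


def build(component: str) -> int:
    # -j lets make parallelize within each component as well
    return subprocess.run(["make", "-C", component, "-j", "8"]).returncode


with ThreadPoolExecutor(max_workers=len(COMPONENTS)) as pool:
    results = list(pool.map(build, COMPONENTS))

if any(results):
    raise SystemExit("one or more component builds failed")
```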

Software-wise, we threw away the old and unreadable build scripts and wrote a graphical build application that is easy to configure and runs scheduled builds without user intervention. We trained several developers so they could use the system to run all kinds of builds without needing the CM team. At the time, it was enough. Today, we realize it’s a limited system; it is missing features that will be expensive to develop, and we spend too much time maintaining it. We are now opting for a third-party continuous integration tool instead.
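To give a feel for what such an unattended scheduler boils down to, here is a minimal sketch of a nightly build loop; the product names, build commands, and 2:00 a.m. schedule are illustrative, not a description of our actual tool:

```python
# Sketch of an unattended nightly build loop; products, commands,
# and the schedule are illustrative only.
import subprocess
import time
from datetime import datetime, timedelta

BUILD_CONFIGS = {
    "product-a": ["./build.sh", "--config", "release"],
    "product-b": ["./build.sh", "--config", "debug"],
}


def run_nightly() -> None:
    for name, cmd in BUILD_CONFIGS.items():
        log_path = f"{name}-{datetime.now():%Y%m%d}.log"
        with open(log_path, "w") as log:
            rc = subprocess.run(cmd, cwd=name, stdout=log, stderr=log).returncode
        print(f"{name}: {'OK' if rc == 0 else 'FAILED'} (log: {log_path})")


while True:
    now = datetime.now()
    next_run = now.replace(hour=2, minute=0, second=0, microsecond=0)
    if next_run <= now:
        next_run += timedelta(days=1)  # today's 02:00 has passed; wait for tomorrow's
    time.sleep((next_run - now).total_seconds())
    run_nightly()
```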

I think investing the time in a homegrown tool was a mistake in the long run. My conclusion here:

5. Use a well-established third-party build tool, and start doing it when you’re still small.

Don’t be tempted to write your own tools. Eventually you will outgrow them and will just need to repeat all that work with a third-party tool.

Our development processes have changed dramatically as well over the years. We recruited people with different backgrounds and levels of experience, so we had to make our processes more and more rigid to ensure everyone works in the right context; this also reinforces the need for more frequent, shorter builds.

Another notable change is the branching strategy. When we were small, almost all developers committed directly to the active mainline. With the introduction of nightly builds, it became apparent that we needed better control over those commits; otherwise, the builds kept breaking. Also, the more developers you have, the greater the chance that an undesired change will slip into a build and go undiscovered until after the product is released. Today, developers work in private branches, each team has an integration branch, and only team leaders, after reviewing the branch content, merge it into the mainline.
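The promotion step at the end of that flow is easy to script. The sketch below illustrates it with git (our own source-control tool differs, but the flow is the same); the branch names are placeholders:

```python
# Sketch of the team leader's promotion step, shown with git.
# Branch names are placeholders; our actual source-control tool differs.
import subprocess


def sh(*cmd: str) -> None:
    subprocess.run(cmd, check=True)


def promote_team_branch(team_branch: str, mainline: str = "main") -> None:
    """After review, merge a team's integration branch into the mainline."""
    sh("git", "fetch", "origin")
    sh("git", "checkout", mainline)
    sh("git", "pull", "--ff-only", "origin", mainline)
    # --no-ff keeps a merge commit, so the team's delivery stays visible as one unit
    sh("git", "merge", "--no-ff", f"origin/{team_branch}")
    sh("git", "push", "origin", mainline)


# Example: promote_team_branch("team-ui/integration")
```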

A completely new requirement that emerged was the need for traceability between change items (bugs and tasks), code, and builds. When we were a small company, we only tracked critical bugs, each developer kept their own record of which changes they made, where, and when, and builds were run once or twice a month. Now we have more than a hundred developers, each handling several bugs and tasks per week, and multiple builds of every product a day. We can no longer rely on individual developers to remember when a certain feature was delivered or which build contains a certain bug fix. Therefore, I dare to say:

6. Traceability is inevitable when the number of developers passes a certain point.

We had to come up with ways to integrate the source control, issue tracker, and build system. Today, each commit is connected to an issue, and the build system updates the relevant issues with the related build information. We’re also looking to integrate further with other data, such as requirements and tests.
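A simplified sketch of that commit-to-issue-to-build link is shown below; the issue-ID pattern and the tracker’s REST endpoint are hypothetical stand-ins for whatever your own tools expose:

```python
# Sketch: collect issue IDs from the commits in a build and tag those issues
# with the build number. The issue-ID pattern and tracker endpoint are hypothetical.
import json
import re
import subprocess
import urllib.request

ISSUE_PATTERN = re.compile(r"\b(?:BUG|TASK)-\d+\b")     # e.g., BUG-1234 in a commit message
TRACKER_URL = "https://tracker.example.com/api/issues"  # placeholder endpoint


def issues_in_build(prev_rev: str, build_rev: str) -> set:
    """Return the issue IDs mentioned by commits between two build tags."""
    log = subprocess.run(
        ["git", "log", "--format=%s%n%b", f"{prev_rev}..{build_rev}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return set(ISSUE_PATTERN.findall(log))


def tag_issue_with_build(issue_id: str, build_number: str) -> None:
    """Attach the build number to an issue via the tracker's (assumed) REST API."""
    req = urllib.request.Request(
        f"{TRACKER_URL}/{issue_id}/builds",
        data=json.dumps({"build": build_number}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)


# for issue in issues_in_build("build-1041", "build-1042"):
#     tag_issue_with_build(issue, "1042")
```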

Lastly, a word about software costs: Many commercial tools become ridiculously expensive when you have a large number of users. I’m not just talking about license costs, but also administration, hardware requirements, and other hidden costs. At some point we decided to replace our issue-tracking tool, simply because the cost of its licenses and maintenance far exceeded that of comparable tools. After some review, we found a tool that was actually better in most respects and cost one-thirtieth of what we had been paying! The lesson here:

7. Continuously evaluate the cost of tools you use compared to what the market has to offer.

We are now in the process of evaluating the need for replacing our expensive and somewhat outdated source-control tool as well; after all, that market has advanced a lot in the past few years.

Being in charge of CM in a growing company is a challenging and interesting experience. The most important thing to remember is to plan ahead. If you know what to expect and what pitfalls to avoid, you’ll have a better chance of making it work.
