Test Coverage in the Age of Continuous Delivery

Summary:
Test coverage is a strategy to help us spend scarce testing time on the right priorities. When things were tested last, how much automation coverage we have, how often the customers use the feature, and how critical the feature is to the application are all factors to consider. Here are some ideas for keeping quality high when you're transitioning to continuous delivery.

In the bad old days, we would have a testing phase that could last for weeks or months. We’d start out just testing and finding bugs, but eventually, we’d start to get a build solid enough to consider for release.

The testers would swarm on the candidate, and we would never have enough time to run all our test ideas against the software. Even if we did, we wanted to test a balance of uses in order to make sure all the features—or core use cases, or components, or requirements—were “covered” by tests for this build.

The idea of coverage was born.

Twenty years later, most of the teams I work with no longer have a “testing phase.” If they do, it is a half-day or a day, maybe a week at most. Some larger enterprises have hardening sprints to test integrating components from multiple teams, but they tend to view this as a transitional activity, not an end state.

Testing is also a lot more complex than it used to be. We have unit test coverage, integration test coverage, automated test coverage, and, yes, actual human exploration and investigation coverage.

On top of that we have a third dimension: time. Most software organizations I work with have at least a daily build, if not a continual build. Spending a week testing a release candidate rarely works, as people are busy committing fixes, often on the master branch, the same place the build candidate is pulled from. With continuous deployment to staging, the very staging server we are testing is changing in real time.

Using continuous delivery to production, each fix rolls out to production separately—there is no “wait and test everything” moment.

Changing What “Deploy” Means

When Windows programs shipped, they used to actually ship, in a box or on a CD. We would collect the current versions of all the files and deploy them as a bunch.

The web changed all that; all of a sudden, we could push just one single web page, and perhaps a few images, to production at a time. If the web page was isolated and the only risk was that it would go wrong, we didn’t need to retest the entire application.

Some of the best-known cases of early continuous delivery were really just pushing static PHP files individually or in small groups. As long as the code did not change a code library or database, and the programmer could easily roll back a mistake, there was suddenly no need for a long, involved regression test process.

Microservices offer us a similar benefit. As long as the services are isolated and we know where the main application calls the service, then we can test the service in staging, where it interacts with the user interface, and roll it out—without a large shakedown of the entire application.

The Transition to Continuous Delivery

Many of the teams I work with are trying to move to microservices, but things just aren’t that simple. They don’t have the technologies in place to do push-button, isolated deploys. If they do, then they certainly don’t have easy rollback.

Rollback usually consists of making a manual change back and deploying forward. It requires quite a bit of infrastructure to isolate a change and roll forward without rolling out everything else committed lately. One company I worked with had this problem, and testers would just comment out all the changes made since the last push.

Don’t do that.

Meanwhile, the idea of coverage has been lost. We pretend we live in this perfect world of isolated services, but failure demand is still very high. A fix to one feature or component easily leaks to other features. Until these “ripple changes” are eliminated, continuous delivery will just mean rolling out a bunch of broken code quickly.

The bottom line: Teams need a game plan for how to test as they transition to continuous delivery.

Tracking Risks and Features

What I see today is that teams have a list of all the automated test ideas and run them right before deploy—all the Selenium tests, all the unit tests, and so on.

The problem is that all the test ideas that were too much work to automate, such as tab order, printing, and resizing the browser, are ignored. Perhaps they are tested once for each new story, then forgotten. And, of course, it is the forgotten things that end up biting us.

In the team’s final burndown process, you could use a low-tech testing dashboard based on the features of the product, assigning each feature a score from one to five (or frowny face to smiley face) for how well it is tested. For the next release, when deciding whom to assign to what, take a look at the previous release's dashboard and cover the things that were touched for this release, are critical to the product, or were just not covered well.
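To make the dashboard idea concrete, here is a minimal sketch in Python. It is only one way to model it, not a prescribed tool: the `Feature` fields, the `low_score` threshold of 2, and the sample feature names are all assumptions made for illustration. It picks the features that were touched, are critical, or scored poorly, and lists the weakest coverage first.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    score: int       # 1 (frowny face) to 5 (smiley face), from the previous release
    touched: bool    # changed for the upcoming release?
    critical: bool   # critical to the product?


def needs_attention(feature: Feature, low_score: int = 2) -> bool:
    """A feature qualifies if it was touched, is critical, or scored poorly."""
    return feature.touched or feature.critical or feature.score <= low_score


def plan(features: list[Feature], low_score: int = 2) -> list[Feature]:
    """Return the features to cover, weakest previous coverage first."""
    chosen = [f for f in features if needs_attention(f, low_score)]
    return sorted(chosen, key=lambda f: f.score)


# Hypothetical dashboard carried over from the previous release.
dashboard = [
    Feature("login", score=5, touched=False, critical=True),
    Feature("printing", score=1, touched=False, critical=False),
    Feature("search", score=4, touched=True, critical=False),
    Feature("help pages", score=4, touched=False, critical=False),
]

for f in plan(dashboard):
    print(f"{f.name}: score {f.score}")
```

A spreadsheet or a whiteboard does the same job; the point is only that the selection rule is simple enough for the whole team to see and argue about.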

You could also write emergent risks on sticky notes and put them on the wall, sorted by priority. Anyone can add anything they want to the wall, which has the most serious problems at the bottom. Each day, every member of the team pulls at least one of those risks off the wall, tests for it, and then moves the note somewhere else and dates it. Eventually you add those cards back to the top of the to-be-tested stack. This strategy even works for continuous delivery—just pull cards all the time. You might even start testing in production!
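If the wall ever outgrows sticky notes, the same rotation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended tool: the `RiskCard` and `RiskWall` names, the numeric priority scale (1 is most serious), and the 14-day recycle window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RiskCard:
    risk: str
    priority: int                       # 1 = most serious (assumed scale)
    last_tested: Optional[date] = None  # the date written on the note


class RiskWall:
    def __init__(self) -> None:
        self.to_test: list[RiskCard] = []
        self.tested: list[RiskCard] = []

    def add(self, card: RiskCard) -> None:
        """Anyone can add a card; the wall stays sorted by priority."""
        self.to_test.append(card)
        self.to_test.sort(key=lambda c: c.priority)

    def pull(self, today: date) -> Optional[RiskCard]:
        """Take the most serious waiting risk, date it, and move it aside."""
        if not self.to_test:
            return None
        card = self.to_test.pop(0)
        card.last_tested = today
        self.tested.append(card)
        return card

    def recycle(self, today: date, max_age_days: int = 14) -> None:
        """Put cards last tested max_age_days or more ago back on the wall."""
        cutoff = today - timedelta(days=max_age_days)
        stale = [c for c in self.tested if c.last_tested <= cutoff]
        self.tested = [c for c in self.tested if c.last_tested > cutoff]
        for card in stale:
            self.add(card)


wall = RiskWall()
wall.add(RiskCard("tab order", priority=3))
wall.add(RiskCard("printing", priority=1))

first = wall.pull(date(2024, 1, 1))   # most serious risk comes off first
wall.recycle(date(2024, 1, 20))       # 19 days later, the card goes back up
```

Here recycled cards are re-sorted by priority rather than placed at a fixed position; a team could just as easily put them back wherever it makes sense on its own wall.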

Focus on Priorities

Coverage is a strategy to help us spend scarce testing time on the right priorities. When things were tested last, how much automation coverage we have, how often the customers use the feature, and how critical the feature is to the application are all factors to consider.

Instead of giving you some pseudoscientific formula that pops out how well to test each piece, I have tried to offer a couple of ways to visualize and tackle the problem that create shared understanding.

Some teams can ignore classic ideas of coverage and can deploy small components independently. For everyone else: We had better get to work.

CMCrossroads is a TechWell community.
