Flickering Test Failures and End-to-End Tests

Summary:

In "Growing Object Oriented Software Guided By Tests", Steve Freeman and Nat Pryce talk about the dangers of tests that occasionally fail, otherwise known as flickering tests. These failures can cause teams to start seeing these failures as false positives, and distrust their build results. I know - it's happened to me, especially with end-to-end Selenium tests.

In "Growing Object Oriented Software Guided By Tests", Steve Freeman and Nat Pryce talk about the dangers of tests that occasionally fail, otherwise known as flickering tests. These failures can cause teams to start seeing these failures as false positives, and distrust their build results. I know - it's happened to me, especially with end-to-end Selenium tests.

We experienced some test flickering in our Selenium builds, and guess what: they were real bugs - obscure, tricky-to-find bugs that happened only occasionally. But the Selenium suite ran dozens of tests after every commit, so those failures turned up frequently in the build. The benefit came from exercising the system that often.

For me, most of these flickering failures have happened in end-to-end tests. These can be the trickiest to get right, as they have to coordinate interaction between many components of a system - often in an asynchronous manner.
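One concrete symptom of that asynchrony in Selenium tests is a fixed sleep that usually, but not always, outlasts the work the browser is waiting for. Below is a minimal sketch of the usual fix - polling for the expected state with an explicit wait instead of sleeping. The URL, element IDs, and expected text are hypothetical, and the wait API shown assumes a recent Selenium WebDriver Java binding.

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.WebElement;
    import org.openqa.selenium.firefox.FirefoxDriver;
    import org.openqa.selenium.support.ui.ExpectedConditions;
    import org.openqa.selenium.support.ui.WebDriverWait;

    import java.time.Duration;

    public class OrderConfirmationTest {
        public static void main(String[] args) {
            WebDriver driver = new FirefoxDriver();
            try {
                driver.get("http://localhost:8080/orders/new");
                driver.findElement(By.id("submit-order")).click();

                // Instead of Thread.sleep(2000), poll until the asynchronously
                // rendered confirmation appears, failing clearly after 10 seconds.
                WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
                WebElement banner = wait.until(
                        ExpectedConditions.visibilityOfElementLocated(By.id("confirmation")));

                if (!banner.getText().contains("Order accepted")) {
                    throw new AssertionError("Unexpected confirmation text: " + banner.getText());
                }
            } finally {
                driver.quit();
            }
        }
    }

The point isn't this particular condition; it's that the test states what it is waiting for and how long it is willing to wait, so a slow asynchronous step either passes or fails for a reason you can read.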

There has been a lot of discussion about this topic recently.

    • Freeman and Pryce address this in their book, with some strategies and tools for dealing with asynchronous interaction - one of them, sampling (polling) the system until an assertion passes, is sketched below. If you haven't read "Growing Object-Oriented Software, Guided by Tests" (you'll see this book referenced on Twitter via the acronym homophone #goos), I highly recommend it. The authors know a lot about good design, and you'll experience it first-hand as they build a working example from start to finish.
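To illustrate the sampling idea, here is a minimal, generic poll-until-the-assertion-passes helper. The class name, signature, and timeouts are my own for illustration, not an API from the book or from any library.

    // Minimal sketch: keep re-running an assertion until it passes or time runs out.
    public final class Eventually {

        private Eventually() {}

        public static void assertEventually(Runnable assertion, long timeoutMillis, long pollIntervalMillis)
                throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            AssertionError lastFailure;
            while (true) {
                try {
                    assertion.run();  // re-check the system's current state
                    return;           // passed: the asynchronous effect has arrived
                } catch (AssertionError e) {
                    lastFailure = e;  // not yet: remember why and keep polling
                }
                if (System.currentTimeMillis() >= deadline) {
                    throw lastFailure; // rethrow the most recent failure for diagnosis
                }
                Thread.sleep(pollIntervalMillis);
            }
        }
    }

A test then wraps whatever it checks, for example Eventually.assertEventually(() -> assertEquals("Order accepted", page.statusText()), 5000, 100); - where page.statusText() stands in for the real check - instead of asserting once at a moment chosen by luck.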

How have you dealt with flickering test failures? Were they end-to-end tests?

User Comments

1 comment
Anonymous

I have had my fair share of flickering test failures. Long before I get to the point that my automation is finding flickering test failures, I have established that I am not an idiot and if I say there is a problem, there is a problem. Basically, I and everyone who works for me have to understand the importance of establishing trust and not reporting false alarms.

What happens after that, when a flickering test failure occurs, is usually a discussion not of whether the failure exists but of how we can gather enough information to know what the issue really is. We start with the understanding that looking at the problem will change it. For example, running the test case while the application is running from within a debugger will dramatically change the timing, and the problem often disappears.

The general attitude is to put something in the production code (e.g. better logging) which will give us an idea of what was happening leading up to the failure. We then forget about it and keep testing and fixing other issues. Every time the flickering test failure occurs, we look at the extended information to see if anything makes sense. If not, we add some more monitoring code.

This has proven to be quite successful for me. Just trying to guess at what the problem is and where to put additional logging gets everyone thinking about how the code works and what sort of assumptions were made that might not necessarily be true.

January 18, 2010 - 7:08pm
