Don't Believe Everything You Read!

There are volumes of written material covering just about every aspect of software engineering. Books, articles, magazines, conference proceedings, Web sites, and other rich sources of information are readily available to those learning about our profession. However, based on personal experience and observation, Ed Weller is compelled to ask how much of this information is actually misinformation. Anytime you collect data you must proceed with caution! In this article, we'll find out why Ed questions validity and accuracy and what you can do next time you're faced with questionable material.

The Misstatements and Misinformation
Many years ago, IEEE Software magazine published "Lessons from Three Years of Inspection Data," in which I described the results of the inspection program at Bull HN Information Systems using four case studies, which were more like experience reports. I ran across a reference to this article recently in a highly respected information source (anonymous to protect the publication). In it, they made the following statements paraphrasing the article:

"6,000 inspections..." The actual number of inspections was 7,413.

"They estimate that code inspections before unit test found 80 percent of the defects." The article reported the results of two projects--one project found 80 percent of the product defects prior to system testing and another project found 98.7 percent of defects by inspection after a year of "unit test, integration test, and continued use of the product by other projects."

"Inspections after system test found 70 percent of the defects." This statement is ambiguous: it could mean that inspections were conducted after system test (a rather absurd use of the inspection process), or that inspection effectiveness would drop once system test was concluded. The article said the effectiveness would "undoubtedly drop after system test," but gave no numbers. In fact, this statement came from the fourth case study, yet it was applied in the paraphrase of the first case study.

"They have concluded inspections can replace unit testing." I stated that the findings supported suggestions by Frank Ackerman and his colleagues that inspections could replace unit testing, but specifically stated we ran the unit test cases in integration test "to ensure that inspections did not miss categories of defects that are difficult to detect by inspection."

There were four errors in three sentences that significantly changed the information conveyed in the original article. Since this was a paraphrase, and not a direct quotation, there is no way the casual reader would be aware of the rephrasing. The errors are obviously unintentional, and I suspect their source is the author's interpretation of the original article in the context of how they execute inspections and testing. One of the difficulties in writing articles is the limited space allocated by editors. It precludes providing all of the background information a reader would need to accurately translate the findings of the original article into their own organization or into a secondary article.

In another case, data from the IEEE Software article was used to calculate the ROI of inspections in a white paper on a Web site. The end result was that inspections saved several times the total development budget of the organization. I never could determine how these calculations were done, but at my request they were deleted from the presentation and removed from the Web site.

Incomplete Information
One of the more interesting cases of incomplete information is the statement that the chance of an error in a one-line code change is 50 percent. This number is given without any reference to where it is measured--initial code, after unit testing, after system testing, or in use. To be fair, the source for this suggests this number is the initial error rate without any inspection or testing, but the article is not explicit. If you are trying to measure maintenance quality or convince your organization to change its maintenance process, knowing exactly where in the lifecycle this number was measured is important.

I have measured this number in a variety of organizations, starting with the defect rate of the fixes shipped to the customer, with rates varying from 0 percent to as high as 30 to 40 percent. I have seen defect rates in inspections of one-line fixes vary from 10 percent to as high as 30 to 50 percent. Testing usually runs in the 20 to 40 percent range, but this number is highly dependent on the quality of inspections prior to testing.

I ran across a case where the data in a conference presentation looked funny. When I questioned the author, he admitted mixing major and minor defects on one chart. His reason was that "it made a better story," and he admitted that mixing the data was incorrect.

Let the Reader Beware
So what is the poor reader supposed to do? We are faced with incorrect paraphrasing, incomplete explanations, and--on occasion--deliberate errors of representation. I use the following methods to help me determine if information is of high quality.

First, look for articles that are peer-reviewed or posted on moderated Web sites. Peer-reviewed journals will typically have fewer technical errors. Check the author's acknowledgements to see if they thank reviewers. This increases the chance that errors have been removed. For Web sites, look for publications that encourage feedback or forums encouraging discussion ( is an example). Attend conferences where presentations are reviewed for accuracy (sponsors, review committees, or program chairs that are respected for program content).

Second, beware of hype and exaggerated or out-of-context claims. If a case study or experience report is generalized into a larger context, be skeptical. If the reported cost savings are equal to or greater than the original project cost, something is amiss! Look for claims of savings or ROI that apply only to a part of the development or test cycle but are inadvertently generalized over the full project. One study of the schedule improvement attributed to software process improvement claimed a 95 percent improvement. (A twenty-week task would take one week.) I find this a bit hard to believe if applied to the total project schedule. On the other hand, if applied to a task within the project, I can see where a system build or final test stage activities can be improved this much by eliminating rework due to defects.
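Both checks in this paragraph are simple arithmetic, and a minimal sketch in Python makes them concrete. The function names and the dollar figures below are hypothetical, chosen only for illustration; the twenty-week, 95 percent example is the one from the text.

```python
def remaining_duration(original, improvement_pct):
    """Return the duration left after a claimed percentage improvement.

    Works in whatever unit `original` uses (weeks, in the example below).
    """
    return original * (100 - improvement_pct) / 100


def savings_exceed_cost(claimed_savings, project_cost):
    """Return True when claimed savings equal or exceed the project cost.

    A True result is the "something is amiss" signal: the claim deserves
    a closer look before it is repeated.
    """
    return claimed_savings >= project_cost


if __name__ == "__main__":
    # The claim from the text: a 95 percent improvement on a twenty-week task.
    print(remaining_duration(20, 95), "week(s) remaining")

    # Hypothetical figures, not taken from any real report.
    print(savings_exceed_cost(claimed_savings=3_000_000, project_cost=1_000_000))
```

Restating a percentage as the duration that remains is often enough to show whether a claim can plausibly apply to a whole project or only to one task within it.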

Third, try to find multiple sources of information to allow you to compare and contrast numbers and results. If there are wide disparities in the results, proceed with caution. When an article is paraphrased and there is a reference, check the reference to see if your interpretation of the original matches that of the author who paraphrased it.
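The reference check can be sketched as a comparison between a paraphrased figure and the figure in the original. The example uses the inspection counts quoted earlier in this article (6,000 in the paraphrase, 7,413 in the original). The function names and the 5 percent tolerance are assumptions made for illustration, not an established threshold.

```python
def relative_difference(paraphrased, original):
    """Return the absolute difference as a fraction of the original figure."""
    if original == 0:
        raise ValueError("original figure must be non-zero")
    return abs(paraphrased - original) / abs(original)


def needs_checking(paraphrased, original, tolerance=0.05):
    """Flag a paraphrased figure that strays from the original.

    The default tolerance of 5 percent is an arbitrary assumption; pick a
    value that suits the precision of the numbers being compared.
    """
    return relative_difference(paraphrased, original) > tolerance


if __name__ == "__main__":
    # Inspection counts from this article: paraphrase versus original.
    difference = relative_difference(6000, 7413)
    print(f"relative difference: {difference:.1%}")
    print("needs checking:", needs_checking(6000, 7413))
```

A numeric check like this only catches misquoted numbers. Errors of context, such as a statement moved from one case study to another, still require reading the original.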

Fourth, look to the underlying principles that are being discussed. This is where the real value lies in most of what we read. Identify the method or process being used, and see if it can be applied to your organization. Consider running a pilot project to verify that similar results are possible in your organization.

There is a lot of information available; unfortunately not all of it is accurate. We must objectively consider the information we gather. With a little practice, we'll soon be able to separate the good apples from the bad.

CMCrossroads is a TechWell community.