As has been brought up several times over the past several
years, metrics produced from the various CM repositories are a great aid in
determining the relative quality of a product or release, and they allow
data mining for root cause analysis. This article explores some
specific metrics within the software development arena and how they can help
others.
First, let's define the CM repositories we will be using:
- Version Control (VC)
- Defect, Issue and Enhancement Tracking (DIET)
- Change Control (CC), if different from DIET
VC contains data on every change to every element in the
Product. The definition of what constitutes an element and what data is
maintained varies between VC tools, so the table below gives some of the more
common types:
| Element Type | Discussion |
|---|---|
| Text Files | Text files are considered elements by every software VC tool; however, some of the older mainframe tools are EBCDIC rather than ASCII in nature. Even worse, some tools are 7-bit ASCII based while others are 8-bit, and the relatively recent addition of Unicode and other multilingual character sets is changing even the concept of "text." For the purposes of this article, "text" files are those normally presented to a human for editing in a coding environment. |
| Binary Files | Binary files are considered elements by most software VC tools; however, how these files are handled differs widely. Since we are more concerned with the data associated with these files than with their contents, we can ignore the differencing and merge problems associated with them. |
| Directories | Directories are considered elements only by the more recent VC tools, where "recent" is a relative term. When directories are available as elements, and when the VC tool is able to track file moves, additional metrics are possible. |
| Email, Facsimile and other non-developmental element types | These types are rarely controlled by VC tools, though there is no good reason to exclude them. Typical uses are for requirements and change requests. For the purposes of this article, these elements will be ignored. |
The VC repository needs to be able to track the following
data:
- Element creation date
- Element last-modified time at time of creation
- Element privileges (think *NIX) at time of creation
- Element creation name (optional: location)
- Element creation author
- Revision creation date
- Revision last-modified time at time of check-in
- Revision privileges (think *NIX) at time of check-in
- Revision element name (optional: location)
- Revision creation author
- "Branch" revision is on at time of check-in
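
As a rough illustration of the list above, the VC data can be modeled as two record types: one captured when an element is created and one captured at each check-in. This is a minimal Python sketch, not the schema of any particular VC tool; the class names, field names and sample values (`ElementRecord`, `RevisionRecord`, `mode`, `src/parser.c` and so on) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ElementRecord:
    """Captured once, when the element is first placed under version control."""
    created: datetime        # element creation date
    last_modified: datetime  # last-modified time at time of creation
    mode: int                # privileges at time of creation, *NIX style (e.g. 0o644)
    name: str                # creation name, optionally including location
    author: str              # creation author


@dataclass(frozen=True)
class RevisionRecord:
    """Captured at every check-in of an element."""
    created: datetime        # revision creation date
    last_modified: datetime  # last-modified time at time of check-in
    mode: int                # privileges at time of check-in
    name: str                # element name (and optionally location) at check-in
    author: str              # revision creation author
    branch: str              # "branch" the revision is on at time of check-in


# Hypothetical values, for illustration only.
element = ElementRecord(
    created=datetime(2024, 1, 8, 9, 0),
    last_modified=datetime(2024, 1, 5, 17, 30),
    mode=0o644,
    name="src/parser.c",
    author="alice",
)
revision = RevisionRecord(
    created=datetime(2024, 1, 9, 14, 15),
    last_modified=datetime(2024, 1, 9, 14, 10),
    mode=0o644,
    name="src/parser.c",
    author="bob",
    branch="main",
)

# A rename or move shows up as a revision whose name differs from the element's.
was_moved = revision.name != element.name
```

Keeping the element's creation data separate from the per-revision data is one possible design; it makes it easy to ask both "when did this file first appear?" and "who changed it, where, and when?"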
DIET contains, or should contain, the following data at a
minimum:
- Type of ticket (Defect, Issue, Enhancement, RFI, etc.)
- Date ticket created
- Who created ticket
- Who ticket was created for (Customer, other user, etc.)
- Occurrence detection location (product, release, patch, etc.)
- Occurrence resolution location (product, release, patch, etc.)
- Date ticket resolved
- State/Status transitions for ticket (e.g., Open, Assigned, Tested, Resolved, Released)
CC (or the DIET implementation of CC) contains the following
data at a minimum:
- DIET ticket id
- Approval to proceed (e.g., time stamped electronic signature)
- Approval to release (e.g., time stamped electronic signature)
- Associated elements changed in order to resolve request
- Assignment to causal analysis
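
The DIET and CC data listed above can be sketched the same way, with the DIET ticket id serving as the link between the two repositories. Again, this is only an illustrative Python sketch under assumed names: `Ticket`, `ChangeControlRecord`, `current_status` and the sample values are hypothetical and do not come from any specific tracking tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Ticket:
    """Minimum DIET data for one ticket."""
    ticket_id: str
    ticket_type: str                    # Defect, Issue, Enhancement, RFI, etc.
    created: datetime                   # date ticket created
    created_by: str                     # who created the ticket
    created_for: str                    # customer, other user, etc.
    detected_in: str                    # product, release, patch, etc.
    resolved_in: Optional[str] = None   # product, release, patch, etc.
    resolved: Optional[datetime] = None
    # (timestamp, status) pairs, e.g. Open, Assigned, Tested, Resolved, Released
    transitions: List[Tuple[datetime, str]] = field(default_factory=list)


@dataclass
class ChangeControlRecord:
    """Minimum CC data, linked to DIET through the ticket id."""
    ticket_id: str
    approved_to_proceed: Optional[datetime] = None  # time-stamped signature
    approved_to_release: Optional[datetime] = None  # time-stamped signature
    elements_changed: List[str] = field(default_factory=list)
    causal_analysis: Optional[str] = None           # assigned analysis, if any


def current_status(ticket):
    """Return the most recent status, or None if no transition was recorded."""
    if not ticket.transitions:
        return None
    return max(ticket.transitions, key=lambda t: t[0])[1]


# Hypothetical values, for illustration only.
ticket = Ticket(
    ticket_id="D-101",
    ticket_type="Defect",
    created=datetime(2024, 1, 8),
    created_by="qc_tester",
    created_for="customer",
    detected_in="release 2.1",
    resolved_in="patch 2.1.1",
    resolved=datetime(2024, 1, 12),
    transitions=[
        (datetime(2024, 1, 8), "Open"),
        (datetime(2024, 1, 9), "Assigned"),
        (datetime(2024, 1, 12), "Resolved"),
    ],
)
cc_record = ChangeControlRecord(
    ticket_id="D-101",
    approved_to_proceed=datetime(2024, 1, 9),
    elements_changed=["src/parser.c"],
)
status = current_status(ticket)
```

Storing every status transition with its timestamp, rather than only the latest status, is what makes the duration and reappearance metrics discussed in this article computable later.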
This is a lot of data. CM by itself only cares that the data
is collected and is accurate; the content is the responsibility of others. Let's
take a look at some of the metrics this data can be used for. Note that when
periods of time are referenced, it is common to do the analysis daily,
weekly and monthly. It is also common to use a rolling timeframe (such as the
trailing 30 days) instead of a fixed one (such as a calendar month).
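
To make the distinction between fixed and rolling timeframes concrete, here is a small Python sketch that counts events (check-ins, tickets opened, and so on) both ways. The function names and the sample dates are hypothetical; the fixed timeframe is assumed to be the ISO calendar week and the rolling timeframe a window of days ending on a chosen date.

```python
from collections import Counter
from datetime import date, timedelta


def fixed_weekly_counts(days):
    """Count events per ISO calendar week (a fixed timeframe)."""
    counts = Counter()
    for d in days:
        iso = d.isocalendar()          # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def rolling_count(days, as_of, window_days=7):
    """Count events in the window_days ending on as_of, inclusive (a rolling timeframe)."""
    start = as_of - timedelta(days=window_days - 1)
    return sum(1 for d in days if start <= d <= as_of)


# Hypothetical check-in dates, for illustration only.
checkins = [
    date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 5),
    date(2024, 1, 8), date(2024, 1, 9), date(2024, 1, 10),
]

per_week = fixed_weekly_counts(checkins)
last_seven_days = rolling_count(checkins, as_of=date(2024, 1, 10))
```

With these sample dates the fixed view reports three check-ins in each of the two calendar weeks, while the rolling view as of January 10 reports four, because its window straddles the week boundary.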
- Version Control only
  - Number of files changed per unit time - Depending on the phase of development, this metric can be trended to determine whether the rate of change is slowing down. If the number of files changed per working day stays constant or increases, then the Release Candidate (RC) being worked toward is nowhere near ready for Functional Testing, much less release.
  - Number of lines of code added/changed/removed per unit time - This is a finer-grained analysis of the above. It allows a similar trend to be created that is more aligned with the actual scope of the changes made and not just the number of files. This metric is not as common as the one above because it is more time consuming to create and more easily misunderstood by "outsiders." It also qualitatively yields the same result.
  - Number of lines of code added/changed/removed per file change - This metric gives a relative measure of how drastic changes tend to be. The more lines of code changed, the greater the chance of introducing new defects into the code at the same time. It can also indicate the complexity of the codebase when the average number is high, as opposed to only a few files "spiking the graph."
  - Number of files changed per RC build - This implies that other trends indicate a release is, or should be, possible from the codebase being built. There are two main reasons for multiple RC builds: (1) the build itself fails and must be corrected, and (2) QC has rejected the previous build as containing unacceptable defects, including requirements that are not met. The number of files changed per RC build of the second type can be trended to indicate whether the overall "quality" of the RC is improving.
  - Number of lines of code added/changed/removed per RC build - This is a finer-grained analysis of the above. It allows a similar trend to be created that is more aligned with the actual scope of the changes made between RC builds and not just the number of files. This metric is not as common as the one above because it is more time consuming to create and more easily misunderstood by "outsiders." It also qualitatively yields the same result. It is often used by QA to look for patterns that indicate possible causes of defect insertion; in other words, "stop making changes like this..."
  - Number of files changed per person (generally per some other unit as well) - This is a personal productivity metric that can be easily misused if published. The intent is for this metric to be trended (it should average out under normal conditions), but it needs to be used in conjunction with the complexity of the code being modified; the more complex the element modified, the lower the change rate is likely to be.
  - Number of lines of code changed per person (generally per some other unit as well) - This is a finer-grained personal productivity metric with all of the caveats of the previous metric.
- DIET only
  - Number of defects opened per unit time - As the number of new defects per unit time decreases, it is assumed that there are fewer defects to find and that their occurrence in a production environment becomes increasingly unlikely. As a cumulative metric, this is generally trended with the metric below to show that as the areas under the two curves approach each other, the overall "quality" of the RC is increasing to the point where the number of defects remaining may be allowed to pass to the next release.
  - Number of defects resolved/closed per unit time - As the number of resolved defects per unit time decreases, it cannot be assumed that there are fewer defects left to resolve. This metric must be used in conjunction with either the previous metric or the following one.
  - Number of defects worked per unit time per user - This is an indication of how numerous and/or how complex the defects are. If the number goes down, the two most common causes are a shrinking number of defects available to be worked or the increasing complexity of the defects being worked. This metric is normally used as a modifier or "explainer" for the two metrics above when they are combined.
  - Number of times a defect reappears after being resolved - This is indicative of two things: (1) poor CM practices by developers and/or (2) a broken merge process. This is one of the few metrics that directly relates to how well CM is doing its job.
  - Max/min/avg duration a defect is open - This set of metrics, especially when trended over time, yields estimates that can be used for planning purposes. Of course, it is also desirable to work towards reducing the numbers, but there is a practical limit to how much can be done.
- Both
  - Number of files changed per defect - Also known as "Volatility," this is an indication of how pathologically connected the codebase is and how fragile the code is. Remember, the greater the amount of code changed, the greater the chance of defect insertion. If the trend is widespread or if the same files keep being changed, an Architectural review should be considered.
  - Number of lines of code changed per defect - This is a finer-grained analysis similar to the above.
  - Defects per 1000 lines of code - Also known as "Defect Density," this is an overall measure of product quality. The goal is to reduce this number as rapidly as possible, so a trend over time should indicate success or failure of these efforts.
  - Defects introduced per 100 lines of code changed - Also known as "Defect Insertion Rate," this is a measure of how well or how poorly developers pay attention to their repair work. A trend over time should show whether this is a problem that should be addressed by Management. Note that it is often difficult to measure this accurately, since a newly detected defect may simply be an existing one only recently uncovered by ongoing development and/or repair.
  - Number of defects per file per unit time - Also known as "Fragility," this is an indication of a file that should be considered for redesign, reimplementation or refactoring. The chances are that a file that scores a high number for this metric will become stable in the short run but, as the maintenance and enhancement phase kicks in, will once again return to prominence.
- Change Control used in conjunction with DIET
  - Max/min/avg time from Approval to Proceed to start of work - This gives a set of metrics that can be used for planning purposes, but for very little else.
  - Number of Approvals to Proceed per Approver - This gives a metric that can be watched for abusers and/or slackers within the CC process. The aim is to find process violations, not to point fingers.
  - Max/min/avg time from DIET Resolved status to Approval to Release - This gives a set of metrics that can be used for planning purposes, but for very little else.
  - Assignments to causal analysis - This provides an indication of the desire to recognize general classes of problems. Since causal analysis is costly in terms of both time and human resources, only a small number of analyses can be underway at any one time, so it is imperative that someone be able to recognize related requests and lump them together.
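
Several of the metrics above reduce to simple arithmetic once the VC and DIET data have been joined. The Python sketch below shows one way to compute Defect Density, Volatility, a per-file defect count (the raw input to Fragility) and the max/min/avg open duration. It assumes the ticket-to-file linkage is available as `(ticket_id, file_name)` pairs, for example from the CC "associated elements changed" data; all function names and sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def defect_density(defect_count, lines_of_code):
    """Defects per 1000 lines of code ("Defect Density")."""
    return 1000.0 * defect_count / lines_of_code


def files_changed_per_defect(changes):
    """Average number of distinct files changed per defect ("Volatility").

    changes is an iterable of (ticket_id, file_name) pairs.
    """
    files_by_ticket = {}
    for ticket_id, file_name in changes:
        files_by_ticket.setdefault(ticket_id, set()).add(file_name)
    return mean(len(files) for files in files_by_ticket.values())


def defects_per_file(changes):
    """Number of distinct defects that touched each file.

    Restricting changes to one time window gives the per-unit-time
    count used for "Fragility".
    """
    distinct_pairs = set(changes)
    return Counter(file_name for _, file_name in distinct_pairs)


def open_durations(tickets):
    """Return (max, min, avg) days open for (created, resolved) datetime pairs."""
    days = [(resolved - created).total_seconds() / 86400
            for created, resolved in tickets]
    return max(days), min(days), mean(days)


# Hypothetical data, for illustration only.
changes = [
    ("D-1", "a.c"), ("D-1", "b.c"),
    ("D-2", "a.c"),
    ("D-3", "a.c"), ("D-3", "c.c"), ("D-3", "d.c"),
]
resolved_tickets = [
    (datetime(2024, 1, 1), datetime(2024, 1, 3)),
    (datetime(2024, 1, 2), datetime(2024, 1, 8)),
    (datetime(2024, 1, 5), datetime(2024, 1, 6)),
]

density = defect_density(defect_count=12, lines_of_code=48000)
volatility = files_changed_per_defect(changes)
per_file = defects_per_file(changes)
longest, shortest, average = open_durations(resolved_tickets)
```

In this sample, `a.c` is touched by all three defects, which is the kind of repeat offender the Volatility and Fragility discussions suggest flagging for review. The numbers only become meaningful when trended over time, as the article stresses.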
Okay, we have seen the data that is normally available to
CM, and we have seen some metrics and how they can be produced, trended and
used, but how does this constitute a “Method of Quality Testing?” Quality is an
attribute of the product. Testing is a method of determining the relative
“goodness” of that attribute in the sense that it is either a direct or
indirect measure of the product’s compliance with a set of requirements.
Compliance with Functional and Business requirements is best left up to the QC
group and their active testing. Things that can adversely affect that compliance include:
- An incomplete implementation of one or more Functional and/or Business requirements
- A misunderstanding of one or more Functional and/or Business requirements
- Interface inconsistencies within the Product itself, or with other external Products
- Logic inconsistencies within the Product itself
The CM-generated metrics allow QA to participate in developing ways of
minimizing the number of defects attributable to each of these contributing
factors, allow both QA and Development to detect and address “bad”
coding patterns and overly complex elements, and give Management ways to determine “relative compliance” (also known as
“When will this be ‘good enough’ that I can release this?”). From these
perspectives, CM is providing more analysis than test, but the use of the
resulting metric trends serves the same purpose, especially when expressed as
control graphs.
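
As a rough sketch of what expressing a metric trend as a control graph can involve, the Python below derives a center line and upper and lower control limits from a baseline period and flags later observations that fall outside them. This is a deliberately simplified, Shewhart-style calculation that uses the sample standard deviation of the baseline and a three-sigma band; both choices, along with the function names and the sample numbers, are assumptions for illustration. Formal control charts often estimate the spread differently.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits computed from baseline observations."""
    center = mean(baseline)
    spread = sigmas * stdev(baseline)
    # Counts cannot be negative, so the lower limit is clamped at zero.
    return max(0.0, center - spread), center, center + spread


def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, _, upper = limits
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical weekly counts of defects opened, for illustration only.
baseline_weeks = [12, 10, 11, 13, 9, 11, 12, 10]
recent_weeks = [11, 14, 19, 8]

limits = control_limits(baseline_weeks)
flagged = out_of_control(recent_weeks, limits)
```

Here only the third recent week (19 defects opened) falls outside the band, which is the kind of signal that would prompt a closer look rather than a conclusion by itself.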
Collect data, compute the metrics and perform the analyses
that make sense within your corporate culture. Don’t do things that have no
Return on Investment (ROI) or that will not be used by downstream groups. If
your repositories are sound, you can always mine for historical data if you
decide you need to expand your portfolio. Good luck; have a happy and safe
holiday season.
Ben Weatherall is currently based in Fort Worth,
Texas where he practices Practical CM on a daily basis using a
combination of CVS and custom tools to support a modified Agile-SCRUM
development methodology. He is a member of IEEE, ASEE (Association of
Software Engineering Excellence – The SEI’s Dallas based SPIN
Affiliate), NTLUG (North Texas Linux Users Group), and PLUG (Phoenix
Linux Users Group).