Metrics are, for all practical purposes, objective
measurements of something. This article will define several classes of metrics
and describe whether they are normally planned or "mined" from existing CM
repositories.
As far as Configuration Management goes, there are three
kinds of metrics: Control, Process and Analysis. Control metrics are things like
Defect Density and Open/Close rates. Their purpose is to identify when physical
processes go out of control or come back under control. Process metrics are
intended to determine whether processes are being followed and whether
improvements are really improvements. Analysis metrics are those that reveal
unforeseen trends or root causes.
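As a rough illustration of the two Control metrics just named, the sketch below computes Defect Density (defects per 1000 lines of code) and a daily tally of defects opened versus closed. The record layout (an `opened` and an optional `closed` date per defect), the sample data and the function names are assumptions made for this example, not the format of any particular defect tracker.

```python
from collections import Counter
from datetime import date

def defect_density(defect_count, sloc):
    """Defects per 1000 source lines of code."""
    if sloc <= 0:
        raise ValueError("sloc must be positive")
    return defect_count / (sloc / 1000)

def open_close_by_day(defects):
    """Return {day: (opened, closed)} for every day with activity.

    Each defect is a dict with an 'opened' date and an optional 'closed' date.
    """
    opened = Counter(d["opened"] for d in defects)
    closed = Counter(d["closed"] for d in defects if d.get("closed"))
    days = sorted(set(opened) | set(closed))
    return {day: (opened[day], closed[day]) for day in days}

# Hypothetical defect records, for illustration only.
defects = [
    {"id": 1, "opened": date(2024, 1, 2), "closed": date(2024, 1, 3)},
    {"id": 2, "opened": date(2024, 1, 2), "closed": None},
    {"id": 3, "opened": date(2024, 1, 3), "closed": date(2024, 1, 3)},
]

density = defect_density(len(defects), sloc=12_000)
rates = open_close_by_day(defects)
```

In practice the per-day pairs would be plotted over time, and the same function could be run on a filtered subset of defects (by severity or subsystem, for instance) to get a finer-grained view.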
Some metrics, such as Process metrics, need to be carefully
planned so that their collection is as painless as possible and so that they
cannot be "misused," while others are better "mined" than "collected." Let's
take each of these three categories in turn:
- Control Metrics - These metrics
allow you to determine state information about a product. Since these
metrics are generated and used in real time, their collection should be
both planned and automated, though the historical data used to establish
baselines and limits is often mined. Only a limited number of these should
be implemented, though new ones may be added and older ones retired as
a product matures or the corporate culture changes. As an example, three
of these states might be:
  - The relative quality of the product, based on data derived from defect
    records. This is where the classic Defect Density (e.g., defects per 1000
    lines of code) and Open vs. Close Rates (e.g., a plot of the number of
    defects opened per day vs. the number of defects closed on the same day)
    come into play. Each of these can be sliced (finer granularity) and diced
    (addition of a third or further axis) to give additional insight into the
    overall product quality at any point in time.
  - The relative codebase quality, based on data derived from both version
    control and defect records. This is where metrics such as Fragility Index
    (the number of different defects that were either fully or partially
    resolved by changes to specific files or subsystems) come into play. This
    is also where pure codebase metrics such as SLOC changes, cyclomatic
    complexity and number of function points per module are used.
  - The "releasability" of a product at any point in time, based on data
    derived from defect records. Metrics such as "Number of Critical and High
    Severity Bugs > 0" and "Open/Close Rate > 0.75" are two typical
    examples used to decide if the overall quality of a release is "good
    enough" to release.
- Process Metrics - These metrics
allow you to determine state information about current processes. When you
hear, "Plan your metrics so you can maximize your return on your
investment," what is really being said is that, since processes are for
the most part organizational in nature, the data that need to be collected
will have high visibility and a correspondingly high cost of acquisition.
It is not uncommon to capture the time worked on each defect, and each person
who even looks at a defect can be required to log the time, phase
(detection/reporting, analysis, triage (CRB/CCB), repair and test) and comments.
There are many other types of data that can be collected and used for
Process Metrics, but most end up being used for
time-and-motion or efficiency reports. A few of these other metrics are:
  - Number of Waivers per Release
  - Rate of Process Being Followed (using data collected at each exit and
    entrance gate to indicate whether all of the prerequisites were in place)
  - Rate of Needed vs. Superfluous Release Candidate Builds (using data
    collected after every release candidate build to determine, after the
    fact, whether it was necessary and, if not, whether that could have been
    determined prior to the build)
- Analysis Metrics - These metrics
allow you to analyze trends, patterns or root causes. In other words, this
kind of metric is used to discover new things from existing data. In
almost all cases, the data used for these analyses is mined, not planned. There
is no restriction on the type of data to be used so long as its collection
has been "reasonably" consistent. I stress "reasonably" only because some
are of the opinion that data cannot be used for analysis unless it is
collected the same way and with the same level of attention to detail every
time. This is not true, though analysis is obviously easier if the data being
used is truly consistent.
These kinds of metrics make up the majority of the effort expended by the
CM team since analysis tends to be a continuous series of one-of-a-kind
tasks. It is not uncommon, however, for the end result of an analysis to be
the addition of one or more new control metrics. Some of the questions these
metrics address might include:
  - Are code merges introducing additional defects?
  - Are code merges being performed successfully?
  - Are associated database changes being propagated with merged code?
  - Are there coding patterns that can be correlated to classes of defects?
  - Are there trends indicating pre-implementation phases are causing
    excessive rework?
  - Are there detectable reasons why one subsystem is more prone to defects
    than others?
  - Do enhancement implementations cause excessive numbers of new defects to
    be posted, especially outside the area(s) normally associated with the new
    enhancement?
  - Are there individuals who tend to submit faulty code on the "first try?"
  - Is there a way to determine the typical maximum number of defects that can
    be submitted per day per tester? Does the number vary significantly by
    tester? By area under test?
  - What is the typical number of defects submitted per day (per tester, per
    area) for each phase of testing?
  - What is the average number of defects that require rework? On average, how
    many times?
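To make the idea of mining concrete, here is a minimal sketch of how the last question in the list (how many defects need rework, and how many times) might be answered from status-change history that a defect tracker already holds. It assumes the history can be exported as `(defect_id, status)` pairs and that every transition to a status named `reopened` counts as one round of rework. Both the export format and the status names are hypothetical and will differ from tool to tool.

```python
from collections import Counter

def rework_summary(events):
    """Summarize rework from (defect_id, status) events.

    Assumes every transition to 'reopened' is one round of rework.
    Returns (fraction of defects reworked, mean rounds per reworked defect).
    """
    all_ids = {defect_id for defect_id, _ in events}
    if not all_ids:
        return 0.0, 0.0
    reopen_counts = Counter(
        defect_id for defect_id, status in events if status == "reopened"
    )
    fraction = len(reopen_counts) / len(all_ids)
    mean_rounds = (
        sum(reopen_counts.values()) / len(reopen_counts) if reopen_counts else 0.0
    )
    return fraction, mean_rounds

# Hypothetical status history, for illustration only.
events = [
    (101, "opened"), (101, "resolved"), (101, "reopened"), (101, "resolved"),
    (102, "opened"), (102, "resolved"),
    (103, "opened"), (103, "resolved"), (103, "reopened"), (103, "resolved"),
    (103, "reopened"), (103, "resolved"),
    (104, "opened"),
]

fraction, mean_rounds = rework_summary(events)
```

Nothing here had to be planned in advance: the status history exists because the tool records it anyway, which is what makes this kind of question a mining exercise rather than a collection exercise.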
As you can tell, I like
metrics. Some of the data behind them need to be planned for and specifically
collected, while others can simply be mined from the tools already being used.
There are some things you can do to make things better for future mining
efforts, though. Ask yourself these questions:
- How good are your Defect Records?
- How pathological is your codebase?
- How many fragile files, in absolute terms and by percentage, are in your
codebase?
- What is the difference between collecting metrics and data mining?
- Do you recognize the difference between metrics for root cause analysis,
blame assignment, personal gratification (usually by the requiring managers),
etc. and process control?
- How important is the timeliness of the data?
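The fragile-files question can be approached with the Fragility Index described earlier: the number of different defects resolved, fully or partially, by changes to a given file. The sketch below assumes each commit record already carries the list of files it touched and the defect IDs it references. The record layout, the sample paths and the threshold of three defects for calling a file "fragile" are all assumptions for illustration. A real cutoff would have to come from your own history.

```python
from collections import defaultdict

def fragility_index(commits):
    """Map each file to the number of distinct defects its changes helped resolve."""
    defects_by_file = defaultdict(set)
    for commit in commits:
        for path in commit["files"]:
            defects_by_file[path].update(commit["defects"])
    return {path: len(ids) for path, ids in defects_by_file.items()}

def fragile_files(index, all_files, threshold=3):
    """Return (fragile paths, percentage of the codebase they represent)."""
    fragile = sorted(p for p in all_files if index.get(p, 0) >= threshold)
    percent = 100 * len(fragile) / len(all_files) if all_files else 0.0
    return fragile, percent

# Hypothetical commit records linking changed files to defect IDs.
commits = [
    {"files": ["core/parser.c", "core/lexer.c"], "defects": [11]},
    {"files": ["core/parser.c"], "defects": [12, 13]},
    {"files": ["ui/menu.c"], "defects": [12]},
    {"files": ["core/parser.c"], "defects": [11]},  # same defect, counted once
]
all_files = ["core/parser.c", "core/lexer.c", "ui/menu.c", "ui/about.c"]

index = fragility_index(commits)
fragile, percent = fragile_files(index, all_files)
```

The sketch only works if commits reliably reference the defects they address, which is one reason the quality of your defect records and check-in discipline matters for later mining.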
In Summary -
There are three kinds of metrics: Control, Process and Analysis. Control
metrics are consistently used and are generally automated to determine the
state of the product. The data behind Control metrics needs to be automatically
collected and as timely as possible so the resulting metrics can be used for
decision support. Process metrics are used to determine how well a process is
being followed or whether a process improvement is successful or not. The data
behind Process metrics is generally collected at an Enterprise or Organizational level and must
be planned carefully so as to prevent unnecessary overhead. These metrics are
most often used as health indicators, and there are rarely more than 10 of them. Analysis metrics are
developed and used as needed to detect trends, patterns and root causes. The
data behind Analysis metrics is most often mined from existing repositories and
tools since it is impossible to predict what question will be asked next.
All of these metrics are valuable. Each has its own purpose.
And not all have to be present to win!