Concepts: Key Measures of Testing
The key measures of a test include coverage and quality.
Test coverage is the measurement of testing completeness, expressed either as the
coverage of test requirements and test cases, or as the coverage of executed code.
Quality is a measure of the reliability, stability, and performance of the target of
test (the system or application-under-test). Quality is based on the evaluation of
test results and the analysis of the defects discovered during testing.
Coverage 
Coverage metrics provide answers to the question "How complete is the
testing?" The most commonly used coverage measures are requirements-based and
code-based test coverage. In brief, test coverage is any measure of completeness
with respect to either a requirement (requirements-based) or the code's design /
implementation criterion (code-based), such as the verification of use cases
(requirements-based) or the execution of all lines of code (code-based).
Any systematic testing activity is based on at least one test coverage strategy. The
coverage strategy guides the design of test cases by stating the general purpose of the
testing. The statement of coverage strategy can be as simple as stating that all
performance behavior will be verified.
If the requirements are completely cataloged, a requirements-based coverage strategy
may be sufficient for yielding a quantifiable measure of testing completeness. For
example, if all performance test requirements have been identified, then the test results
can be referenced to produce measures such as "75 percent of the performance test
requirements have been verified."
If code-based coverage is applied, test strategies are formulated in terms of how much
of the source code has been executed by tests. This type of test coverage strategy is very
important for safety-critical systems.
Both measures can be derived manually using the equations given below, or they can be
calculated by test automation tools.
Requirements-based test coverage
Requirements-based test coverage is measured several times during the test life cycle
and identifies the test coverage achieved at each milestone in the testing life
cycle (such as the planned, implemented, executed, and successful test coverage).
- Test coverage is calculated by the following equation:
Test Coverage = T(p,i,x,s) / RfT
where:
T is the number of Tests (planned, implemented, executed, or successful), expressed as
test procedures or test cases.
RfT is the total number of Requirements for Test.
- In the Plan Test activity, the planned test coverage is calculated using the
following equation:
Test Coverage (planned) = Tp / RfT
where:
Tp is the number of planned Tests, expressed as test procedures or test cases.
RfT is the total number of Requirements for Test.
- In the Implement Test activity, as test procedures are implemented (as test
scripts), test coverage is calculated using the following equation:
Test Coverage (implemented) = Ti / RfT
where:
Ti is the number of implemented Tests, expressed as the number of test procedures or
test cases for which there are corresponding test scripts.
RfT is the total number of Requirements for Test.
- In the Execute Test activity, two test coverage measures are used: one identifies the
test coverage achieved by executing tests, and the other identifies the successful test
coverage (tests that executed to completion without failures):
Test Coverage (executed) = Tx / RfT
where:
Tx is the number of Tests executed, expressed as test procedures or test cases.
RfT is the total number of Requirements for Test.
Successful Test Coverage = Ts / RfT
where:
Ts is the number of executed Tests, expressed as test procedures or test cases, that
completed successfully and without defects.
RfT is the total number of Requirements for Test.
Turning the above ratios into percentages allows the following statement of
requirements-based test coverage:
x% of test cases (T(p,i,x,s) in the above equations) have been
covered, with a success rate of y%
This is a meaningful statement of test coverage that can be matched against defined
success criteria. If the criteria have not been met, then the statement provides a basis
for predicting how much testing effort remains.
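As an illustration, the following Python sketch computes the four requirements-based
coverage measures as percentages. The counts are hypothetical and would normally come
from a test management tool:

    def coverage_percent(tests: int, requirements_for_test: int) -> float:
        """Test Coverage = T / RfT, expressed as a percentage."""
        return 100.0 * tests / requirements_for_test

    rft = 120          # total Requirements for Test (hypothetical)
    planned = 110      # Tp: tests planned
    implemented = 95   # Ti: tests with corresponding test scripts
    executed = 80      # Tx: tests executed
    successful = 60    # Ts: tests that completed successfully, without defects

    for label, count in [("planned", planned), ("implemented", implemented),
                         ("executed", executed), ("successful", successful)]:
        print(f"Test Coverage ({label}): {coverage_percent(count, rft):.1f}%")

    # Success rate among executed tests (the "y%" in the statement above):
    print(f"Success rate: {100.0 * successful / executed:.1f}%")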
Code-based test coverage
Code-based test coverage measures how much code has been executed during the test,
compared to how much code there is left to execute. Code coverage can either be based on
control flows (statement, branch, or paths) or data flows. In control-flow coverage, the
aim is to test lines of code, branch conditions, paths through the code, or other elements
of the software's flow of control. In data-flow coverage, the aim is to test that data
states remain valid through the operation of the software, for example, that a data
element is defined before it is used.
Code-based test coverage is calculated by the following equation:
Test Coverage = Ie / TIic
where:
Ie is the number of items executed, expressed as code statements, code branches,
code paths, data state decision points, or data element names.
TIic is the total number of items in the code.
Turning this ratio into a percentage allows the following statement of code-based test
coverage:
x% of code items (Ie in the above equation) have been covered, with a success rate of y%
This is a meaningful statement of test coverage that can be matched against defined
success criteria. If the criteria have not been met, then the statement provides a basis
for predicting how much testing effort remains.
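In practice, code-based coverage is measured by instrumentation tools, but the mechanics
can be sketched in Python. The illustrative tracer below measures statement coverage for
a single function by comparing the lines executed during one call against the executable
lines in the compiled code:

    import dis
    import sys

    def statement_coverage(func, *args):
        """Minimal statement-coverage probe: Ie / TIic for one function.

        Ie   = the source lines actually executed during the call.
        TIic = all executable lines in the function's compiled code.
        """
        executed = set()

        def tracer(frame, event, arg):
            if event == "line" and frame.f_code is func.__code__:
                executed.add(frame.f_lineno)
            return tracer

        sys.settrace(tracer)
        try:
            func(*args)
        finally:
            sys.settrace(None)

        all_lines = {line for _, line in dis.findlinestarts(func.__code__)
                     if line is not None}
        return len(executed & all_lines) / len(all_lines)

    def classify(x):
        if x > 0:
            return "positive"
        return "non-positive"

    # Only the x > 0 path runs, so coverage is less than 100%:
    print(f"statement coverage: {statement_coverage(classify, 5):.0%}")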
Quality 
While the evaluation of test coverage provides the measure of testing completion, an
evaluation of defects discovered during testing provides the best indication of software
quality. Quality is the indication of how well the software meets the requirements, so in
this context defects are identified as any failure to meet the system requirements.
Defect evaluation may be based on methods that range from simple defect counts to
rigorous statistical modeling.
Rigorous evaluation uses assumptions about the arrival or discovery rates of defects
during the testing process. A common model assumes that the rate follows a Poisson
distribution. The actual defect-rate data are then fitted to the model. The resulting
evaluation estimates the current software reliability and predicts how the reliability
will grow if testing and defect removal continue. This evaluation is described as
software-reliability growth modeling and is an area of active study. Due to the lack of
tool support for this type of evaluation, you should carefully balance the cost of doing
it with the value it adds.
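To make the idea concrete, the following sketch fits a Goel-Okumoto style growth curve,
one common model in this family, to hypothetical weekly defect counts. It assumes NumPy
and SciPy are available; the data and starting values are illustrative only:

    import numpy as np
    from scipy.optimize import curve_fit

    def goel_okumoto(t, a, b):
        """Mean cumulative defects found by time t: m(t) = a * (1 - exp(-b*t)).

        a estimates the total number of defects that will eventually be found;
        b is the per-defect detection rate.
        """
        return a * (1.0 - np.exp(-b * t))

    # Hypothetical cumulative defect counts at the end of each test week:
    weeks = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
    found = np.array([12, 25, 34, 41, 46, 49, 51, 52], dtype=float)

    (a, b), _ = curve_fit(goel_okumoto, weeks, found, p0=[60.0, 0.3])
    print(f"estimated total defects a = {a:.1f}, detection rate b = {b:.3f}")
    print(f"defects predicted to remain: {a - found[-1]:.1f}")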
Defect analysis means analyzing the distribution of defects over the values of one or
more parameters associated with a defect. Defect analysis provides an indication of
the reliability of the software.
For defect analysis, there are four main defect parameters commonly used:
- Status: the current state of the defect (open, being fixed, closed, and so on).
- Priority: the relative importance of addressing and resolving this defect.
- Severity: the relative impact of this defect on the end user, an organization, or
third parties.
- Source: the originating fault that results in this defect, or the component that
will be fixed to eliminate the defect.
Defect counts can be reported as a function of time, creating a Defect Trend
diagram or report, or as a function of one or more defect parameters, such as severity
or status, in a Defect Density report. These types of analysis provide a perspective
on the trends and distribution of defects that reveals the software's reliability.
For example, it is expected that defect discovery rates will eventually diminish as the
testing and fixing progress. A threshold can be established below which the software can
be deployed. Defect counts can also be reported based on their origin in the
implementation model, allowing the detection of "weak modules" or "hot spots": parts of
the software that are fixed again and again, indicating a more fundamental design
flaw.
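A defect density report of this kind is essentially a grouped count. The sketch below,
using hypothetical defect records, tallies defects by severity and by originating module
to surface potential weak modules:

    from collections import Counter

    # Hypothetical confirmed-defect records: (status, severity, source module).
    defects = [
        ("open",   "major", "billing"),
        ("open",   "minor", "billing"),
        ("closed", "fatal", "billing"),
        ("open",   "minor", "reports"),
        ("closed", "major", "auth"),
        ("closed", "minor", "auth"),
    ]

    # Defect density: counts over one defect parameter (here, severity).
    by_severity = Counter(severity for _, severity, _ in defects)
    print("by severity:", dict(by_severity))

    # Counts by origin in the implementation model expose "weak modules":
    by_module = Counter(module for _, _, module in defects)
    print("by module:  ", dict(by_module))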
Defects included in an analysis of this kind must be confirmed defects. Not all
reported defects describe an actual flaw: some may be enhancement requests, may fall
outside the scope of the project, or may duplicate an already reported defect.
Nevertheless, there is value in analyzing why many of the reported defects are
duplicates or cannot be confirmed.
Defect Reports 
The Rational Unified Process provides defect evaluation in the form of three classes of
defect reports:
- Defect distribution (density) reports allow defect counts to be shown as a function of
one or two defect parameters.
- Defect age reports are a special type of defect distribution report. Defect age reports
show how long a defect has been in a particular state, such as Open. In any age category,
defects can also be sorted by another attribute, such as Owner.
- Defect trend reports show defect counts, by status (new, open, or closed), as a function
of time. Trend reports can be cumulative or non-cumulative.
In addition, test results and progress reports show the results of test procedure
execution over a number of iterations and test cycles for the application-under-test.
Many of these reports are valuable in assessing software quality. The usual test
criteria include a statement about the allowable numbers of open defects in particular
categories, such as severity class. This criterion is easily checked with a defect
distribution evaluation. By filtering or sorting on test requirements, this evaluation can
be focused on different sets of requirements.
To be effective, producing reports of this kind normally requires tool support.
Defect Status Versus Priority
Each defect should be given a priority; usually it is practical to have four priority
levels:
- Resolve immediately
- High priority
- Normal queue
- Low priority
Criteria for a successful test could be expressed in terms of how the distribution of
defects over these priority levels should look. For example, a successful test criterion
might be that no Priority 1 defects and fewer than five Priority 2 defects are open. A
defect distribution diagram, such as the following, would then be generated:
[Defect distribution diagram: open defect counts by priority level]
In this example, it is clear that the criterion has not been met. Note that the diagram
needs to include a filter to show only open defects, as required by the test criterion.
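A criterion of this form can be checked mechanically. The following sketch, with
hypothetical counts of open defects per priority level, tests the example criterion
above:

    def criterion_met(open_by_priority: dict) -> bool:
        """Example criterion from the text: no Priority 1 defects open,
        and fewer than five Priority 2 defects open."""
        return (open_by_priority.get(1, 0) == 0
                and open_by_priority.get(2, 0) < 5)

    # Hypothetical counts of *open* defects per priority level:
    open_defects = {1: 2, 2: 7, 3: 14, 4: 5}
    print("criterion met?", criterion_met(open_defects))  # False: not ready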
Additional useful Defect Density Reports: 
Defect Status Versus Severity
Defect severity reports show how many defects there are of each severity class (for
example: fatal error, major function not performed, minor annoyance).
Defect Status Versus Location in the Implementation Model
Defect source reports show the distribution of defects over the elements in the
implementation model.
Defect Age
Defect age analyses provide good feedback on the effectiveness of the testing and the
defect removal activities. For example, if the majority of older, unresolved defects are
in a pending-validation state, it probably means that not enough resources are applied to
the re-testing effort.
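Defect age is simply the time a defect has spent in its current state. The following
sketch computes it from hypothetical records:

    from datetime import date

    # Hypothetical defects: (id, current state, date it entered that state).
    defects = [
        ("D-101", "pending-validation", date(2024, 1, 5)),
        ("D-102", "open",               date(2024, 2, 20)),
        ("D-103", "pending-validation", date(2024, 1, 12)),
    ]

    today = date(2024, 3, 1)
    for defect_id, state, since in defects:
        age_days = (today - since).days
        print(f"{defect_id}: {age_days:3d} days in state '{state}'")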
Trend Reports
Trend reports identify defect rates and provide a particularly good view of the state
of the testing. Defect trends follow a fairly predictable pattern in a testing cycle.
Early in the cycle, the defect rates rise quickly. Then they reach a peak and fall at a
slower rate over time.
[Defect trend diagram: defect rates rise early in the test cycle, peak, and then fall at
a slower rate]
To find problems, the project schedule can be reviewed in light of this trend. For
example, if the defect rates are still rising in the third week of a four-week test cycle,
the project is clearly not on schedule.
This simple trend analysis assumes that defects are being fixed promptly and that the
fixes are being tested in subsequent builds, so that the rate of closing defects should
follow the same profile as the rate of finding defects. When this does not happen, it
indicates a problem with the defect-resolution process; the defect fixing resources or the
resources to re-test and validate fixes might be inadequate.
[Defect trend report: new, open, and closed defect counts over time]
The trend reflected in this report shows that new defects are discovered and opened
quickly at the beginning of the project, and that they decrease over time. The trend for
open defects is similar to that for new defects, but lags slightly behind. The trend for
closing defects increases over time as open defects are fixed and verified. These trends
depict a successful effort.
If your trends deviate dramatically from these, they may indicate a problem and show
where additional resources need to be applied to specific areas of development or
testing.
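A cumulative trend report reduces to counting the defects opened and closed up to each
report date. The following sketch, with hypothetical open/close dates, derives the new,
closed, and open counts:

    from datetime import date

    # Hypothetical defects as (date opened, date closed or None if still open).
    defects = [
        (date(2024, 1, 3),  date(2024, 1, 17)),
        (date(2024, 1, 5),  None),
        (date(2024, 1, 10), date(2024, 1, 24)),
        (date(2024, 1, 15), None),
    ]

    # Cumulative trend at a sequence of report dates:
    for report_day in [date(2024, 1, 8), date(2024, 1, 18), date(2024, 1, 28)]:
        new = sum(1 for opened, _ in defects if opened <= report_day)
        closed = sum(1 for _, c in defects if c is not None and c <= report_day)
        print(f"{report_day}: new={new} closed={closed} open={new - closed}")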
When combined with the measures of test coverage, the defect analyses provide a very
good assessment on which to base the test completion criteria.
Performance Test Measures 
This section still under construction.