9.7.4 __General Methodology for Software Failure Data Analysis__

A step-by-step procedure for software failure data analysis is
shown in Figure 9.7-2 and described below:

__Step 1: Study the failure data__

The models previously described assume that the failure data were
collected after the system was integrated and that the number of failures
per unit time is statistically decreasing. If this is not the case, these
models may not yield satisfactory results. Furthermore, an adequate amount
of data must be available to obtain a satisfactory model. A rule of thumb is
to have at least thirty data points.
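
One way to check the "statistically decreasing" assumption before fitting is a trend test. The sketch below uses the Laplace trend test, which is a swapped-in technique and not one this section names. It assumes the data are individual failure times observed on an interval (0, t_end]; the function names, the 1.645 cutoff (one-sided 5% normal critical value), and the report layout are illustrative choices.

```python
import math

MIN_POINTS = 30  # rule of thumb quoted in the text above


def laplace_statistic(failure_times, t_end):
    """Laplace trend statistic for failure times observed on (0, t_end].

    Values well below zero suggest a decreasing failure rate,
    values well above zero suggest an increasing one.
    """
    n = len(failure_times)
    if n == 0 or t_end <= 0:
        raise ValueError("need at least one failure and a positive t_end")
    mean_time = sum(failure_times) / n
    return (mean_time - t_end / 2.0) / (t_end * math.sqrt(1.0 / (12.0 * n)))


def screen_failure_data(failure_times, t_end, z_crit=1.645):
    """Summarize whether the data look suitable for model fitting."""
    u = laplace_statistic(failure_times, t_end)
    return {
        "n": len(failure_times),
        "enough_data": len(failure_times) >= MIN_POINTS,
        "laplace_u": u,
        "decreasing_trend": u < -z_crit,
    }
```

A data set that fails either check is a candidate for more data collection or for a different class of model, rather than for the steps that follow.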

__Step 2: Obtain estimates of parameters of the model__

Different estimation methods are generally required depending upon
the type of available data. The most commonly used are the least squares
and maximum likelihood methods.
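
As an illustration of the maximum likelihood method, the sketch below fits a two-parameter model to individual failure times. The model is an assumption made for this example: a Goel-Okumoto type mean value function m(t) = a(1 - exp(-bt)) observed on (0, t_end]. Under that assumption, setting the derivatives of the log-likelihood to zero gives a = n / (1 - exp(-b t_end)) and a single equation in b, which is solved here by bisection. The function names are hypothetical.

```python
import math


def go_score(b, times, t_end):
    """Derivative of the profile log-likelihood with respect to b."""
    n = len(times)
    x = b * t_end
    return n / b - sum(times) - n * t_end * math.exp(-x) / (-math.expm1(-x))


def fit_goel_okumoto(times, t_end):
    """Return (a_hat, b_hat) maximizing the assumed NHPP likelihood."""
    n = len(times)
    # A finite maximum exists only if failures cluster in the first half
    # of the observation period (mean failure time below t_end / 2).
    if n == 0 or sum(times) >= n * t_end / 2.0:
        raise ValueError("no finite estimate: data show no decreasing trend")
    lo = 1e-6 / t_end
    hi = 1.0 / t_end
    while go_score(hi, times, t_end) > 0.0:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):  # bisection on the score equation
        mid = 0.5 * (lo + hi)
        if go_score(mid, times, t_end) > 0.0:
            lo = mid
        else:
            hi = mid
    b_hat = 0.5 * (lo + hi)
    a_hat = n / (-math.expm1(-b_hat * t_end))
    return a_hat, b_hat
```

The guard at the top of `fit_goel_okumoto` mirrors the Step 1 requirement: if the failure rate is not decreasing, this likelihood has no finite maximum in b.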

__Step 3: Obtain the fitted model__

The fitted model is obtained by substituting the estimated values
of the parameters into the postulated model. At this stage, we have a
fitted model based on the available failure data.
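
In code, this substitution amounts to fixing the parameters at their estimates. A minimal sketch, again assuming a Goel-Okumoto type mean value function m(t) = a(1 - exp(-bt)); the model choice and the name `fitted_mean_value` are illustrative assumptions.

```python
import math


def fitted_mean_value(a_hat, b_hat):
    """Return the fitted mean value function m(t) with parameters fixed."""
    def m(t):
        # Expected cumulative number of failures by time t
        return a_hat * (-math.expm1(-b_hat * t))
    return m
```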

__Step 4: Perform goodness-of-fit test__

Before proceeding further, it is advisable to conduct the
Kolmogorov-Smirnov goodness-of-fit test or some other suitable test to check
the model fit.

If the model fits, we can move ahead. However, if the model does
not fit, we have to collect additional data or seek a better, more appropriate
model. There is no easy answer to either how much data to collect or how to
look for a better model. Decisions on these issues are very much problem
dependent.
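
The Kolmogorov-Smirnov check can be sketched as follows. The example assumes an NHPP model observed on (0, t_end], for which the failure times, given their count, behave like an ordered sample from the distribution F(t) = m(t) / m(t_end). The Goel-Okumoto form of F and the function names are assumptions for illustration. The critical values are the usual large-sample approximations; they tend to be conservative when the parameters were estimated from the same data.

```python
import math


def go_conditional_cdf(t, b, t_end):
    """F(t) = m(t) / m(t_end) for the assumed m(t) = a(1 - exp(-bt))."""
    return math.expm1(-b * t) / math.expm1(-b * t_end)


def ks_statistic(times, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_value(n, alpha=0.05):
    """Large-sample approximation to the KS critical value."""
    coeff = {0.10: 1.22, 0.05: 1.36, 0.01: 1.63}[alpha]
    return coeff / math.sqrt(n)
```

The fit is not rejected at level alpha when `ks_statistic(...)` is below `ks_critical_value(n, alpha)`.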

__Step 5: Compute confidence regions__

It is generally desirable to obtain 80%, 90%, 95%, and 99% joint
confidence regions for the parameters of the model to assess the uncertainty
associated with their estimation.
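
One common way to approximate joint confidence regions for two parameters is the likelihood ratio method: a parameter pair belongs to the region when twice the drop in log-likelihood from its maximum does not exceed a chi-square quantile with 2 degrees of freedom. This is a large-sample approximation and not necessarily the method intended here. The log-likelihood below assumes a Goel-Okumoto type model with individual failure times on (0, t_end]; all names are hypothetical.

```python
import math

LEVELS = (0.80, 0.90, 0.95, 0.99)  # levels named in the text


def chi2_threshold_2df(level):
    """Exact chi-square quantile for 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - level)


def go_log_likelihood(a, b, times, t_end):
    """Log-likelihood of the assumed model m(t) = a(1 - exp(-bt))."""
    n = len(times)
    return n * math.log(a * b) - b * sum(times) + a * math.expm1(-b * t_end)


def in_joint_region(a, b, a_hat, b_hat, times, t_end, level):
    """True if (a, b) lies in the approximate joint region at `level`.

    (a_hat, b_hat) are the maximum likelihood estimates from Step 2.
    """
    deviance = 2.0 * (go_log_likelihood(a_hat, b_hat, times, t_end)
                      - go_log_likelihood(a, b, times, t_end))
    return deviance <= chi2_threshold_2df(level)
```

Evaluating `in_joint_region` over a grid of (a, b) values for each entry in `LEVELS` traces out nested regions, with the 99% region containing the others.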

__Step 6: Obtain performance measures__

At this stage, we can compute various quantitative measures to
assess the performance of the software system. Confidence bounds can also be
obtained for these measures to evaluate the degree of uncertainty in the
computed values.
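
A few such measures can be computed directly from the fitted parameters. The sketch below assumes a Goel-Okumoto type NHPP model, m(t) = a(1 - exp(-bt)); the measures shown and the name `go_performance` are illustrative, and confidence bounds on them are not computed here.

```python
import math


def go_performance(a_hat, b_hat, t_now, mission):
    """Point estimates of common measures at time t_now.

    mission is the length of the future interval of interest.
    """
    decay = math.exp(-b_hat * t_now)
    # Expected failures in (t_now, t_now + mission]: m(t_now + mission) - m(t_now)
    expected_in_mission = a_hat * decay * (-math.expm1(-b_hat * mission))
    return {
        "expected_failures_so_far": a_hat * (1.0 - decay),
        "expected_remaining_failures": a_hat * decay,
        "failure_intensity": a_hat * b_hat * decay,
        "mission_reliability": math.exp(-expected_in_mission),
    }
```

Re-evaluating these measures at parameter pairs on the boundary of a Step 5 confidence region gives a rough sense of the uncertainty in each value.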