A disciplined and aggressive closed-loop FRACAS is an essential element in the TAAF process. The essence of a closed-loop FRACAS is that failures and faults of both hardware and software are formally reported, analysis is performed to determine failure cause, and positive corrective actions are identified, implemented, and verified to prevent recurrence.
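The closed-loop states described above (report, analysis, corrective action, verification) could be tracked with a record such as the following minimal Python sketch. The class, field, and state names are illustrative assumptions, not part of any FRACAS standard; the point is that the loop is not closed until the corrective action is verified.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Status(Enum):
    REPORTED = auto()   # failure formally reported
    ANALYZED = auto()   # root cause determined
    CORRECTED = auto()  # corrective action implemented
    VERIFIED = auto()   # corrective action verified; loop closed

@dataclass
class FailureReport:
    report_id: str
    item: str                      # hardware or software item that failed
    description: str
    root_cause: str = ""
    corrective_action: str = ""
    status: Status = Status.REPORTED

    def analyze(self, root_cause: str) -> None:
        self.root_cause = root_cause
        self.status = Status.ANALYZED

    def correct(self, action: str) -> None:
        # Corrective action should follow, not precede, root-cause analysis.
        if self.status is not Status.ANALYZED:
            raise ValueError("determine root cause before corrective action")
        self.corrective_action = action
        self.status = Status.CORRECTED

    def verify(self) -> None:
        if self.status is not Status.CORRECTED:
            raise ValueError("implement corrective action before verification")
        self.status = Status.VERIFIED
```

For example, a report would move through `analyze()`, `correct()`, and `verify()` in order, and the guard clauses reject attempts to skip a step in the loop.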
For a successful TAAF process, all failures must be analyzed to the extent needed to determine the root cause of the failure. In many instances, this will not require a detailed laboratory analysis because the cause of the failure, such as test procedure errors, wrong parts, overstressed parts, workmanship errors, etc., will be readily apparent. Likewise, the corrective actions for many of these failure types will be relatively straightforward and easily implemented. Determining the root cause of more complex failures and developing corrective actions early in the development process are crucial, even though the process may be costly and time consuming. However, the cost and time will be considerably less than waiting until later in the acquisition program to correct the failure.
Corrective action options and flexibility are greatest during early design evolution, when even major design changes can be considered in order to eliminate or significantly reduce susceptibility to known failure modes. These options become more limited and more expensive to implement as the design becomes firm. Early elimination of failure modes, and thus early implementation of a good FRACAS, has the following advantages:
Cost and schedule savings.
Ample time to assess corrective actions.
Reduction of repeat occurrences of previously identified failure modes, which reduces redundant data analysis.
Adequate time to address all failures prior to full rate production (i.e., prevention of a corrective action backlog).
If RDT results are to be interpreted correctly, all test conditions and occurrences must be recorded accurately and completely. A key complement to FRACAS is the test log, because a major source of problems is the dynamic status of test item configurations. This is especially significant when multiple copies of an equipment item are under test. By definition, if reliability growth is taking place, the equipment is changing (normally both hardware and software). In addition, temporary repairs and replacements usually are permitted so that testing may continue while permanent fixes are being explored. The implications of a failure differ both quantitatively and qualitatively between repairs and fixes: a repeat failure after a repair simply provides more data on the same phenomenon, while a repeat failure after a fix may invalidate the corrective action. An accurate test log can prevent misinterpretation of results.
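The repair-versus-fix distinction above can be sketched as a small test log that classifies a repeat failure by what was previously done about that failure mode on that unit. This is an illustrative Python sketch; the unit and mode identifiers and the classification labels are assumptions, not terms from the source.

```python
from dataclasses import dataclass

@dataclass
class LogEntry:
    unit: str    # serial number of the equipment copy under test
    mode: str    # failure mode identifier
    action: str  # "repair" (temporary) or "fix" (permanent corrective action)

class TestLog:
    def __init__(self) -> None:
        self.entries: list[LogEntry] = []

    def record(self, unit: str, mode: str, action: str) -> None:
        self.entries.append(LogEntry(unit, mode, action))

    def interpret_repeat(self, unit: str, mode: str) -> str:
        """Classify a new occurrence of `mode` on `unit` against prior entries."""
        prior = [e for e in self.entries if e.unit == unit and e.mode == mode]
        if any(e.action == "fix" for e in prior):
            # A repeat after a permanent fix may invalidate the corrective action.
            return "fix suspect"
        if prior:
            # A repeat after a temporary repair is more data on the same phenomenon.
            return "more data"
        return "new mode"
```

With such a log, a repeat failure on a unit that only received a temporary repair is classified as additional data, while the same repeat on a unit that received a permanent fix flags the corrective action for re-examination.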
It is recognized that there are pragmatic limits to the time, money, and engineering manpower that can be expended on the analysis of a particularly complex failure mode or on the implementation of preferred corrective actions. These limits are determined by item priority, program urgency, available technology, and engineering ingenuity, and they will vary from program to program. Management involvement is required to determine them.