The Essentials of Conducting a Successful
Root Cause Failure Analysis
written by Robert J. Latino, Reliability Center, Inc., and presented at the
Paper Industry Management Conference in October 1998 in Atlanta, Georgia
Abstract: Industry statistics show that approximately $60 billion is spent annually on industrial training. Many firms are involved and competent in providing training in the area of Root Cause Failure Analysis (RCFA). Why, then, do the same statistics show that only about 20% of the people trained ever use their new learning in the field and so provide any return? We will discuss why more than classroom training is essential to the success of any RCFA.
Four key items are necessary for successfully conducting an RCFA:
The Student/Analyst
The RCFA Method
The Training
The Work Environment

The Student / Analyst
     Let's start with the individual. Who are the typical candidates for leading an RCFA? It has been my experience that, more often than not, when a high-stress situation arises around a failure, plant management will assign a task team led by the recognized expert in the plant. This is a commonly held paradigm. However, when we explore the flaws in this logic, we find that when experts LEAD analyses, they tend to already know what the conclusions will be. Hence, as the leader of the team, they tend to steer the team's activities towards their predetermined conclusions. Team members are usually too intimidated to ask seemingly obvious questions openly, as they feel the expert should know and "I do not want to appear stupid!" This scenario has stifled many investigations and forced the spending of millions of dollars on recommendations that do not eliminate the "true" root causes.
     Being the principal analyst of an RCFA should be purely a facilitating role, not a participating role. Individuals who lead such teams must be unbiased, with nothing to gain or lose by the findings. This is important because oftentimes the stakeholders in the outcome are the leaders of the analysis, which means they might be inclined to manipulate the outcome to protect their standing in the facility. If the leaders are unbiased, they can ask the team of experts any question they want, because they are not expected to know everything. I am continually amazed at how often the "experts" cannot answer the most obvious questions of the non-experts and back up their answers with facts! It is another paradigm that the "expert" knows everything. If that were the case, there would not have been a failure!
     In assuming this role as an unbiased facilitator of an RCFA, it is imperative that such individuals possess certain personality traits. The following are just a few key ones:
Persistence in overcoming barriers, versus yielding to them.
A will to WANT to do this work, not HAVE to.
Resourcefulness in doing "Whatever It Takes" to get to the facts.
Diplomacy in dealing with various departments in a tactful but candid manner.
A thick skin in facilitating several "experts" on the same team.
     Choosing the right individual to be trained and to lead RCFAs in the field is the first essential step in conducting a successful RCFA.
The RCFA Method
     All of the RCFA methods and techniques on the market represent various means to attain a common end: accurately determining the root causes of an undesirable event. No matter what technique is employed to analyze an event, the underlying theory of cause and effect relationships will apply. All undesirable events are the result of a series of human errors that queue up in a particular sequence. The various training and automation products available on the market are merely different graphical representations of the perceived chain of errors that lead to the undesired outcome. Knowing this, it becomes the analyst's responsibility to evaluate the various RCFA methods and automation tools available and find those that support their facility's analytical method of choice.
We utilize a process called PROACT, an acronym that stands for the following:
PReserving Failure Data
Ordering the Analysis Team
Analyzing the Data
Communicating Findings and Recommendations
Tracking for Bottom-Line Results
Preserving Failure Data
No matter what the nature of the failure or loss, trying to solve a failure with little or no data is like a detective trying to solve a crime with no evidence or leads. Any failure will leave clues to the sequence of events that led to it. Typical failure data includes parts from the failure scene, the positions of parts and people, the timing of events, and paper data such as DCS information, specs, procedures and the like.

Depending on the circumstances, some data is more fragile than other data. For instance, which type of data is likely to be disturbed the quickest at a failure scene? More than likely, production pressures set in and, as part of general clean-up operations, the positions of parts and people are lost forever. Such data is extremely important to an analysis.
 

Ordering the Analysis Team
It is a common belief that when a failure occurs, the correct course of action is to assemble a team of experts and sit them in a secluded room, and that days later they will come out with the answers. While this may work for some, I have seen it fail miserably. I recall one failure we were involved with, where a certain bundle of tubes would rupture every year in the same location. This problem persisted for 10 years! Every time the failure occurred, the natural reaction was to assemble a team of metallurgists to analyze it. Every time the metallurgists analyzed the failure, their resolution was metallurgical. This is predictable and to be expected.

However, when we were called upon to assist, we sent in an aerospace engineer to lead the analysis. He knew very little about boilers, and that made him the perfect candidate for the lead role. When you are not expected to have all the answers, you can ask any question you want without ridicule. After just a couple of sessions with the experts, the team determined that the tubes were in an area of the boiler that was below the dew point of sulfuric acid, and that the remedy was to move the tubes over 18 inches and return to the base metals.
 

Analyzing the Data
When you have an ideal team put together and a good data collection strategy, you then need a means to logically deduce what the data is telling you. All failures are the result of a string of cause and effect relationships, and every one of the numerous RCFA methods on the market must accept this fact. The only difference between the various methods is how they develop and graphically represent the logic sequence that led to failure. We have heard all the "buzzwords" such as fishbone, fault tree, why tree and the like. They all represent a means by which to sort out the failure data and determine the sequence of errors that led to failure.

All of these methods represent how the mind utilizes deductive logic to draw conclusions. However, the conclusions must be based on fact and not assumptions. This is where some RCFA methods may differ. True "Root Cause" Failure Analysis will identify not only the physical causes of failure, but also the flawed human decisions that lead to errors of omission and errors of commission. The true roots are not in "whodunnit" but in why they made the decisions that they made. This will uncover what we call organizational system roots. Things such as flawed procedures, training systems, purchasing systems and the like are examples of organizational system causes.
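The idea of a logic tree whose conclusions rest only on verified facts can be sketched in a few lines of Python. This is a minimal illustration, not the PROACT method or any vendor's tool: the `Node` structure, the level names, the helper functions, and the pump example are all hypothetical and invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One block in a cause-and-effect logic tree (hypothetical structure)."""
    description: str
    level: str                       # "event", "physical", "human" or "organizational"
    verified: Optional[bool] = None  # True = proven, False = disproven, None = untested
    children: List["Node"] = field(default_factory=list)


def verified_roots(node: Node) -> List[Node]:
    """Return the deepest causes reachable through verified nodes only."""
    if node.verified is not True:
        return []  # an untested or disproven hypothesis ends this branch
    roots: List[Node] = []
    for child in node.children:
        roots.extend(verified_roots(child))
    # A verified node with no verified cause beneath it is the deepest proven cause.
    return roots or [node]


def untested(node: Node) -> List[Node]:
    """List hypotheses that still need evidence before conclusions are drawn."""
    found = [node] if node.verified is None else []
    for child in node.children:
        found.extend(untested(child))
    return found


# Invented example: every entry below is illustrative, not data from a real analysis.
tree = Node("Pump bearing failed", "event", True, [
    Node("Bearing overheated from lack of lubricant", "physical", True, [
        Node("Scheduled lubrication was skipped", "human", True, [
            Node("Lubrication procedure does not list this pump", "organizational", True),
            Node("No refresher training on lubrication routes", "organizational", None),
        ]),
    ]),
    Node("Bearing was defective from the supplier", "physical", False),
])

for root in verified_roots(tree):
    print(f"Proven root ({root.level}): {root.description}")
for hypothesis in untested(tree):
    print(f"Still to verify: {hypothesis.description}")
```

In this sketch a branch stops contributing as soon as a hypothesis is untested or disproven, which mirrors the point that conclusions must be based on fact and not assumptions. The disproven "defective bearing" branch is kept in the tree as a record of what was ruled out.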
 

Communicating Findings and Recommendations
No matter what method of analysis is employed, if the approved recommendations are not acted on, then it was a waste of time and money to perform the analysis. I am sure that many of you have had good projects on the table that were approved but never got any further than that. Another barrier is that these RCFA recommendations are sometimes "low priority" items in a reactive work order system. Therefore, if the recommendations are to be executed, something has to change in the work order system to raise their priority.

Communicating RCFA results is of the utmost importance to your organization because more than likely others in your company can benefit from the information. Chances are that there are similar systems in other parts of your plant or at sister plants that have the same problems. Therefore, if someone has already performed an analysis, then you do not have to go through one yourself. The analysis actually becomes an expert system, a troubleshooting thought process on paper.
 

Tracking for Bottom-Line Results
No analysis is successful if you implemented corrective actions and nothing improved. We cannot be successful unless we measure an indicator of our success. Some believe that a successful RCFA is the identification of root causes. Some think it's the acceptance of recommendations. The true measure, however, is that the failure does not recur. RCFA analysts are essentially in the business of "eliminating the need to do reactive work!"
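As a simple illustration of measuring recurrence, the sketch below counts failures recorded before and after a corrective action was implemented. The function name, the dates, and the single-equipment failure history are all hypothetical; a real facility would pull this information from its own work order or maintenance records.

```python
from datetime import date
from typing import List, Tuple


def split_failures(failure_dates: List[date], action_date: date) -> Tuple[int, int]:
    """Count failures on or before, and after, the corrective action date."""
    before = sum(1 for d in failure_dates if d <= action_date)
    after = sum(1 for d in failure_dates if d > action_date)
    return before, after


# Hypothetical failure history for one piece of equipment (invented dates).
history = [date(1996, 3, 4), date(1997, 2, 18), date(1998, 3, 9)]
fix_implemented = date(1998, 6, 1)

before, after = split_failures(history, fix_implemented)
print(f"Failures before fix: {before}, after fix: {after}")
print("No recurrence so far" if after == 0 else "Failure recurred: reopen the analysis")
```

A count of zero after the fix only means the failure has not recurred yet, so the comparison is worth repeating over a period at least as long as the historical interval between failures.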

This process, like the work of a police detective in the field, embeds the essential elements necessary to conduct the analysis itself and arrive at "solid, factual" conclusions. Methods that are accompanied by automated tools or software allow more analyses to be performed in a given time period, because the software organizes many of the administrative tasks and so shortens the RCFA cycle time.

The Training
Training is a very good tool, when used properly! Oftentimes training is used as a therapeutic tool, so that an organization can say it has trained its people in a certain subject. How many people have been to safety training that is measured by the number of hours spent in a seat to meet compliance with government regulatory agencies? What benefit did the company receive in return for spending the money on the class and taking you out of the field for that period of time? I have never understood why engineering projects are almost always measured on their Return On Investment (ROI) while training is rarely measured against the same rigid criteria.

This brings us to another commonly held paradigm: that if you are trained in a classroom, you are not expected to immediately use that training in the field to generate income for the organization. Training in itself is usually not enough to be successful.

What is necessary for your training dollars to be spent wisely? All training efforts should encompass the following design considerations:
 

Have a Reason to Train - Outline specifically the behavior change you wish the students to exhibit when the training is completed.
Provide Knowledge Training - This is the classroom lecture portion that is intended to transfer knowledge verbally.
Provide Skills Training - This is the classroom exercises that are designed to apply the knowledge learned in a team environment.
Set Expectation of Field Use - This is the management support portion where management communicates to students specific expectations of the use of the new skill in the field and quantifiable results within a given time frame.
Measure Performance - How do we know if the training was successful? We expect the desired behavior to produce some positive effect on the bottom-line. What is that metric and did we attain it?
     Students tend to be more attentive if they know they will HAVE to demonstrate the skill in the field and produce results. Students will also pay more attention if they feel they are not wasting their time in another "program of the month" class that will disappear in six (6) months. Effective training is another essential step in conducting a successful RCFA.
The Work Environment
     How often have we all been to classes and learned interesting things, gone back to the field and tried to implement the new learning, and run across administrative barriers that discourage us from using the new skill? Don't we frequently feel that we go back into the same reactive environment and nobody cares whether we implement the new skill or not?
     The most common objection I hear from my students is, "I do not have time to do RCFA." Think about this oxymoron for a moment. My common retort is, "Why don't you have the time?" People are so busy fire fighting in the field that they do not have the time to use their creativity to figure out how to eliminate the risk of recurrence of the failure.
     Managers must start to take an active role in the concepts and practices in which their people are being trained. This means attending overviews of the classes that are being taught. How can management support something that they have never seen? Managers who do attend such overview courses are often of the mind that their students will be able to solve the world's problems in three (3) days' time. The fact of the matter is that the students will know how to solve failures, but it takes time to prove hypotheses and arrive at accurate conclusions. Can we expect NTSB investigators to determine the reason an airplane crashed in three (3) days? Validating hypotheses to arrive at facts is the time-consuming part of RCFA.
     People trained in RCFA often feel like they are on an island by themselves because they are going against the grain. They are trying to perform a "proactive" task in a reactive environment. Management support is a must in RCFA. Managers should consider the following forms of support essential to conducting a successful RCFA:
Provide time to perform the analyses.
Provide resources to validate hypotheses (expertise, labs, etc.).
Provide recognition for successful analysts.
Provide changes to work order systems that ensure that proactive recommendations are implemented and not put on the "back burner".
Demonstrate support by kicking off RCFA classes.
Write a letter of expectation to RCFA students.
Demand results.
     The combination of these four (4) elements is essential to conducting a successful RCFA. Remember the Chinese Definition of Insanity: "When we do the same thing over and over again, and expect a different result!"
Robert J. Latino is Vice-President of Strategic Development and a Senior Consultant for Reliability Center, Inc. Mr. Latino is a practitioner of root cause failure analysis in the field with his clientele as well as an educator. He is an author of RCI's Root Cause Failure Analysis Methods course and co-author of Failure Analysis/Problem Solving Methods for Field Personnel. He has been published in numerous trade magazines on the topic of failure analysis and is a frequent speaker on the topic at trade conferences. He can be contacted at 804/458-0645 or blatino@reliability.com.
RCI offers the full range of Reliability Consulting Services and Training Programs for Industry. We conduct facilitations, reliability assessments, FMEA & Root Cause Failure Analysis Training - Public & On-Site.
For more information contact:
Reliability Center, Inc.
P.O. Box 1421
Hopewell, Virginia 23860
Phone: (804) 458-0645
Fax: (804) 452-2119
Website: http://www.reliability.com