Root Cause Failure Analysis
by Joy LePree
     As a maintenance professional, you probably spend a lot of time putting out fires: administering quick fixes to small, nuisance-type problems. It's just part of the job.
     Or is it? Did you know tending to these chronic failures can eat up 80% of your maintenance budget? Well, there's a way to entirely eliminate these problems - it's called root-cause failure analysis (RCFA), and it has the potential to save your facility millions of dollars in repair costs and downtime.
     RCFA is a simple, yet disciplined process used to investigate, rectify and eliminate equipment failure, and it's most effective when directed at chronic breakdowns.
RCFA LOGIC TREE
Describe the failure event
Describe the failure modes
Hypothesize
Verify the hypotheses
Determine physical roots & verify
Determine latent roots & verify
 
     "RCFA is applicable anywhere, but what you use it on determines how well it affects your bottom line," says Robert Latino, vice-president of Reliability Center, Inc., a Hopewell, VA-based reliability engineering consulting firm that provides RCFA training to the manufacturing industry.
     "If you use it only in a reactive manner when you have a major, but sporadic event, it will solve the problem," he notes. "But you will save a lot more money if you use it on chronic failures.
     Latino has found that approximately 80% of a typical maintenance budget is consumed by chronic failures, meaning that these events cost far more, in aggregate, than major breakdowns. So it makes sense that the greatest savings come from applying RCFA to routine breakdowns.
     Rick Kalinaukas, reliability engineering supervisor for Union Camp Paper Co., Chesapeake, VA, says his company initiated RCFA for just that reason.
     "We found that a large portion of our downtime came from small events that occurred on a very frequent basis, rather than big, sporadic one-time failures," Kalinaukas says. "Chronic items typically slipped by our system of addressing and prioritizing things because they just seemed to be inconveniences."
     Kalinaukas turned to RCFA to develop a method that addressed those chronic failures.
     "The power of the process is that it shows you how to find the latent roots responsible for the breakdown," Kalinaukas says. "Typically, most organizations have always found the broken pieces and come up with an explanation as to why it broke. But root-cause failure analysis takes you beyond that to the latent roots, which are the management system weaknesses. Once you've found these, you have the means to solve many other potential problems that haven't yet occurred."
     Latino explains that in most failures there are actually three layers. First is the physical component, then there is the human error, and finally the latent root of the problem. The last of these is always the true cause of the problem.
     "Inevitably in any failure there's going to be human error, either someone did something wrong or forgot to do something," he says. "But when you get into true root-cause failure analysis, you get deep into management systems.
     "These include training mechanisms and plant policies, procedures and specifications," he notes. "People make decisions based on these, and if the system is flawed, the decision will be in error and will be the triggering mechanism that causes mechanical failure."
     A prime example of these layers can be seen in a catastrophic fan failure experienced at Chevron Corp.'s San Ramon, CA facility. Supervisor of Maintenance and Operations Walt Flannery says that, following a major fan breakdown, the maintenance department performed RCFA and determined that misshapen bearings were the cause. Further analysis revealed that the defect was due to neglecting to rotate the fan when it was in storage prior to installation.
     According to Latino, the bearing failure was the physical component, and the lack of rotation when in storage was the human element behind it, but the latent root cause was the facility's storage system.
     Corrective action came when the maintenance crew shared the information derived from the analysis with the folks in the storage room so they would know to have large machinery delivered on a just-in-time basis or rotated weekly to prevent future failures.
     Both Flannery and Latino stress that RCFA should be used for seeking out flaws within management systems, and not for laying blame on an individual.
     In order to reveal a faulty system, a multi-step RCFA process is necessary. The method taught by RCI consists of six steps: a failure modes and effects analysis; the preservation of failure information; the organization of an analysis team; the actual analysis; sharing the findings and making recommendations; and tracking the results.
     Of these steps, the failure modes and effects analysis and the actual analysis are the most significant. The first determines which failure events represent your facility's most significant losses. Using this technique, notes Latino, you generally find that 80% of a plant's losses are represented by fewer than 20% of its failure events. These occurrences are referred to as the "significant few" and are the events that should receive top priority for RCFA.
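     The ranking behind the "significant few" can be sketched in a few lines of Python. This is only an illustration, not RCI's method: the event names, frequencies and costs are invented, and it assumes that an event's annual loss is simply its frequency multiplied by its cost per occurrence.

```python
from dataclasses import dataclass


@dataclass
class FailureEvent:
    name: str
    occurrences_per_year: int
    cost_per_occurrence: int  # dollars; assumed to cover repair plus lost production

    @property
    def annual_loss(self) -> int:
        # Assumption: loss = frequency x cost per occurrence.
        return self.occurrences_per_year * self.cost_per_occurrence


def significant_few(events, loss_share=0.80):
    """Return the highest-loss events that together reach loss_share of total loss."""
    ranked = sorted(events, key=lambda e: e.annual_loss, reverse=True)
    threshold = loss_share * sum(e.annual_loss for e in ranked)
    selected, running = [], 0
    for event in ranked:
        if running >= threshold:
            break
        selected.append(event)
        running += event.annual_loss
    return selected


# Hypothetical plant data, invented for illustration only.
EVENTS = [
    FailureEvent("pump seal leak", 260, 2_000),
    FailureEvent("conveyor jam", 150, 2_000),
    FailureEvent("motor trip", 40, 1_000),
    FailureEvent("valve sticking", 35, 1_000),
    FailureEvent("sensor fault", 30, 1_000),
    FailureEvent("belt slip", 25, 1_000),
    FailureEvent("gearbox overhaul", 1, 20_000),
    FailureEvent("fan bearing seizure", 1, 15_000),
    FailureEvent("coupling wear", 10, 1_000),
    FailureEvent("filter clog", 5, 1_000),
]

few = significant_few(EVENTS)
print([e.name for e in few])
```

     With these invented numbers, two of the ten events account for 82% of the annual loss, and both are frequent, low-cost nuisances rather than the rare, expensive breakdowns.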
     During the actual analysis, a logic tree is used to work through a failure. To employ this tool, the failure event is placed in the top block. Under that, all failure modes, or possible causes of breakdown, are listed. The following layer is reserved for hypotheses of how the various failure modes could have occurred. Next comes the verification of the hypotheses to identify which one actually led to the problem. The next steps consist of determining and verifying the physical roots, human roots and latent roots behind the failure. Keep in mind that a separate logic tree must be maintained for each analyzed event.
     Vernon Kingsbury, maintenance coordinator analyst with Lafarge Corp., Alpena, MI, says his company began using RCI's RCFA process in June 1995. One of the earliest applications of RCFA examined the frequent occurrence of a drag conveyor failure.
     For the analysis, the problem was labeled "drag conveyor failure." Failure modes were then identified -- either it was an electrical problem or a broken link.
     After determining that a broken link was the cause of breakdown, Kingsbury says the next step was to develop theories about how the link was damaged -- either overloading the conveyor or using an inadequate drag chain.
     For verification, Kingsbury says the analysts examined the items hauled by the conveyor, as well as the material of which the drag chain was constructed. As it turns out, the chain in use was recommended by the OEM, so the problem was related to overloading.
     The next step asked "How can the conveyor be overloaded?" Several hypotheses were studied, and the problem was soon identified as sporadic placement of too much material on the conveyor.
     Kingsbury says they studied possible reasons for intermittent overloads. Finally, a lack of control over one of the plant's processes was determined to be the root cause of the drag conveyor failure. The analysts examined the process in question and made recommendations for gaining control of it to prevent conveyor loads from overwhelming attendants in the future.
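     For readers who keep their analyses in software, the drag conveyor logic tree described above might be recorded roughly as follows. This is a minimal sketch: the Node class, its field names and the verified_path helper are hypothetical, and the node labels paraphrase Kingsbury's account.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    kind: str               # "event", "mode", "hypothesis" or "root" (hypothetical labels)
    verified: bool = False  # True once the evidence supports this block
    children: List["Node"] = field(default_factory=list)


def verified_path(node: Node) -> List[str]:
    """Follow the verified blocks from the top of the tree downward."""
    path = [node.label]
    for child in node.children:
        if child.verified:
            return path + verified_path(child)
    return path


# One tree per analyzed event; the failure event sits in the top block.
tree = Node("drag conveyor failure", "event", True, [
    Node("electrical problem", "mode"),
    Node("broken link", "mode", True, [
        Node("inadequate drag chain", "hypothesis"),  # ruled out: chain was OEM-recommended
        Node("conveyor overloaded", "hypothesis", True, [
            Node("sporadic placement of too much material", "hypothesis", True, [
                Node("lack of control over a plant process", "root", True),
            ]),
        ]),
    ]),
])

for step in verified_path(tree):
    print(step)
```

     Keeping the rejected branches in the tree, rather than deleting them, preserves a record of what was considered and why it was ruled out.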
     "Identifying the real root-cause solved the problem with the drag conveyor and has had a potential savings of $193,000," notes Kingsbury. He adds that this incident was just one of many chronic failures in the facility that have been corrected through RCFA.
     "Since June of 1995, I've trained 344 people in RCFA at all of our company's different facilities," says Kingsbury. "We normally expect to save a little over $2.25 million over a two year period for every 100 people trained." The potential savings for the company are expected to be around $13 million.
     The savings realized by Chevron are also impressive. In the fan scenario alone, the company saved a fortune. Because the fan that originally broke down was one of eight kept in storage prior to placement in the facility, maintenance crews decided to run vibration tests on the other seven machines to determine if they also contained faulty bearings. The tests detected a common vibration signature that indicated possible failure of all seven in the future. So the equipment was torn down and the bearings were replaced and realigned.
     "Performing those activities cost $2,300," says Flannery. "We spent $32,000 to remedy just one catastrophic fan failure, so we saved $224,000 in repair costs alone." This figure does not include the potential savings from eliminating downtime or from educating employees about proper storage techniques.
     Just one RCFA in Kalinaukas' Union Camp paper mill saved the company $1,021,000 in production losses. The problem was a packing failure on a large piece of machinery that resulted in frequent downtime. Originally, he says, the mechanics responded to the frequent breakdown in a reactive way by replacing the part. Eventually, a detailed root-cause failure analysis identified the problem: the wrong packing sleeve was placed in the machine during an earlier repair due to incorrect purchasing procedures.
     To solve the problem and realize the impressive savings, the packing sleeve was replaced with the right one, and a new purchasing system was initiated to ensure that the wrong part will never be ordered again.
     "The paper industry is such a capital intensive industry, and our throughput rates are so phenomenally large that any small component failure causes tremendous profit loss," notes Kalinaukas. "That's what makes root-cause failure analysis an excellent process for our mill."
     Frank Meitz of the Bonita Springs, FL-based Frank J. Meitz Institute of Maintenance Management, a division of DP Solutions, Greensboro, NC, says process industries probably get the highest return immediately because they have a tremendous number of bearings, drives and motors, as well as equipment powered by different applications than in most other facilities.
     However, the potential savings are substantial for any industry because RCFA promotes reduction of chronic problems and increased mean time between failures in all situations. Other benefits derived from the RCFA process include improved product quality and regulatory compliance through increased reliability, better sensitivity to machinery problems, less non-value-added work and administrative obstacles, and fulfillment of the ISO 9000 certification requirement for "corrective action," according to the Reliability Center.
     But what makes RCFA especially appealing is that the potential benefits far outweigh any capital investment in the process. "This is really a very inexpensive way to save your company a lot of money and time," says Kingsbury. "When we began using root-cause failure analysis all that we really paid for was the training and the text we use."
     The training course offered by RCI runs about $1,000 and the text is about $200, according to Kingsbury. He says one trained person in the facility can pass the knowledge and methods on to the rest of the plant.
     Lafarge's training program is set up very simply. A group is gathered off-site for a one-day session and Kingsbury teaches them RCFA methodology. About four to six weeks later there is a reunion where the students take an actual problem they've encountered in the facility and perform RCFA. At that time the group graduates and the results of the analysis are used to implement any necessary changes in management systems. If it sounds too easy to be effective, consider the fact that the drag conveyor failure was remedied in one of the earliest training sessions.
     After training, most RCFA users find that they can get by without purchasing a lot of new tools or technology. "It has really been a relatively no cost process for us," says Kalinaukas. "We haven't had to purchase any additional equipment because we can use equipment that's already in place, like vibration monitoring."
     However, it should be noted that unless your facility has an in-house lab for things like metallurgical examinations, you may have to turn to an outside source for some analyses. "Depending on a plant's in-house capability to perform internal measuring and analysis, support from an external service may be needed," says Meitz. Finite element analysis, physical modeling, rotor dynamics and metallurgical analysis, says Latino, are the most common tests required for RCFA.
     "In some component failure situations, we'll have items professionally analyzed against original design," says Kalinaukas. "We might send out a sample of packing sleeve material and ask for a professional opinion as to how that component failed.
     "We then receive a formal report stating that a bearing failed due to a lack of lubrication and then we can continue with our RCFA process," he says. "For the most part we use external expertise to back up our conclusions."
     Kingsbury's Lafarge also employs external sources. "We do use outside services to identify some problems, but not that often because not every theory needs outside expertise to be proven," he notes. "In most cases hypotheses can be verified in-house with the tools already available."
     Chevron, says Flannery, is lucky to have its own in-house research and technology lab for destructive and non-destructive failure testing. When the fan failed, bearings were sent to the lab for metallurgical testing.
     Despite the cost of outside services, there will still be a worthwhile return on investment. "You should expect three things to happen," says Meitz. "Overall maintenance costs should be reduced because the maintenance staff will be able to do more work through proper planning and scheduling, rather than reactive maintenance."
     He continues: "Based upon the institution of RCFA, investment in spare parts should be reduced, which will bring a recurring savings on carrying costs. Still, the biggest savings will come from increased use of assets. These three occurrences should really support the input of RCFA."
     With all the benefits of RCFA - namely putting an end to chronic failures and nuisance jobs and saving millions of dollars in maintenance costs - is there really any reason not to consider initiating RCFA in your plant?
RCI offers the full range of Reliability Consulting Services and Training Programs for Industry. We conduct facilitations, reliability assessments, FMEA & Root Cause Failure Analysis Training - Public & On-Site.
For more information contact:
Reliability Center, Inc.
P.O. Box 1421
Hopewell, Virginia 23860
Phone: (804) 458-0645
Fax: (804) 452-2119
Website: http://www.reliability.com
© Copyright 2000 Maintenance Resources, Inc.
Phone: 812.877.7119  -  Fax: 812.877.7116  -  E-Mail: info@maintenanceresources.com
Address: 1983 North Hunt Street  -  Terre Haute, IN 47805