The Power of Failure Analysis to Eliminate Process Interruptions
presented by
Charles J. Latino, President & Founder of Reliability Center, Inc.
Paper Industry Maintenance Conference
October 20-24, 1997
     Our job is to run our plants with the absolute minimum of planned outages and no unplanned shutdowns, producing high quality products at the lowest possible cost. If we accomplish this, we are fulfilled. We know that we have done the job.
     To produce with an absolute minimum of planned outages and no unplanned outages, it follows that our operating and support staff have to be focused on the right priorities. Let us review the reliability formula:
[Figure: the reliability formula; the original image is not reproduced here]
     Since maintainability is a small factor compared to reliability, we have to conclude that our priority has to be reliability, that is, those actions that will eliminate the causes of downtime. This does not suggest that it is inappropriate to determine means to predict failures and thus limit downtime. What it says is that, given limited resources, you will get more for your expense dollar by eliminating failures than by limiting damage after a primary or component failure has occurred.
     The next question has to be: how do we eliminate failures? We have two means to accomplish this, one at the front end of the run cycle and one at the back end. On the front end, we need to institute a precision paradigm, one that says we will strive to do every task once and do it right the first time. If we limit this concept to machinery, it will be very difficult to establish it as a paradigm. It is in our best interest to have precision in administrative activities and in our manufacturing processing, as well as building it in when we work on machinery. We have all seen what happens when crime labs do not act with precision. Criminal cases are put in jeopardy. Lack of precision in industrial labs might have more dire results, certainly for us. Procedures that are not thoroughly thought out can cause disasters, loss of quality, and a drain on money, depending on their application.
     With respect to machinery, Mobil Corporation ran a precision-built BMW on their synthetic oil, providing precision maintenance as delineated in the owner's manual. One million miles were run before the car was dismantled and all the engine parts measured for wear. This is the equivalent of 66 years of normal driving. They found that the vast majority of the parts showed no wear and a few had very slight wear. So much for the statement we often hear that bearings have a very limited life. Indeed, the very concept of running a bearing is that, except for that split second at startup, there should be no metal-to-metal contact if clearances are precision designed and built.
     On the other end of the run cycle, we generally run into failure of one sort or another. Many will be mechanical or process related. Some will be failures to obtain sales and provide orders and others will be failures to build in precision. This latter failure reduces our confidence that our machine can run for a long period of time making it necessary for us to take planned outages. All of these failures can be resolved, and most often eliminated, by using failure analysis. 
     To get the maximum amount of leverage from failure analysis, it must be applied on two tracks. On one track, Root Cause Failure Analysis is practiced to eliminate and/or mitigate those failures that represent 80% of a facility's losses in one focused parameter such as uptime or quality. These are called the Significant Few Failures because they generally represent less than 20% of a facility's total focused failures in that parameter.
     It may seem obvious which failures represent the Significant Few, but experience has demonstrated that they are usually not readily apparent. To help the process of identification, it is suggested that facilities employ a Modified Failure Modes and Effects Analysis. When the application of Failure Modes and Effects Analysis (FMEA) utilizes historical information on failures instead of prospective or potential failure probabilities, the analysis tool becomes simple to use and a much more powerful failure identification tool.
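     The identification step can be illustrated with a short sketch. It assumes that a modified FMEA reduces to ranking historical failure events by annual loss (frequency times impact) and selecting events until 80% of the total loss is covered. The event names, the numbers, and the function name `significant_few` are hypothetical and are not data from any facility.

```python
# Hypothetical failure history for one focused parameter (lost uptime).
# Each entry: (failure event, occurrences per year, hours lost per occurrence).
# All names and numbers are illustrative assumptions.
history = [
    ("conveyor trip", 1000, 0.25),
    ("pump seal leak", 40, 5.0),
    ("bearing failure", 5, 8.0),
    ("instrument fault", 30, 0.5),
    ("belt misalignment", 20, 0.5),
    ("motor overload", 10, 1.0),
    ("valve sticking", 16, 0.5),
    ("gearbox failure", 1, 6.0),
    ("lube system alarm", 12, 0.25),
    ("coupling failure", 2, 4.0),
]


def significant_few(events, share=0.80):
    """Rank events by annual loss and keep the top ones until `share` of the total is covered."""
    ranked = sorted(
        ((name, freq * impact) for name, freq, impact in events),
        key=lambda item: item[1],
        reverse=True,
    )
    total = sum(loss for _, loss in ranked)
    selected, running = [], 0.0
    for name, loss in ranked:
        if running >= share * total:
            break
        selected.append((name, loss))
        running += loss
    return selected, total


selected, total = significant_few(history)
for name, loss in selected:
    print(f"{name}: {loss:.0f} hours/year ({loss / total:.0%} of total)")
```

     With these illustrative numbers, two of the ten events account for more than 80% of the lost hours, and the one with the highest annual loss is the frequent, low-impact event rather than the dramatic one.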
     Many believe that the most significant failures are those sporadic events that result when a plant experiences major trauma like a fire, an explosion or the destruction of a major component in key machinery. As dramatic as these can be, they seldom compare to the losses experienced by chronic mechanical and process failures occurring hourly, daily, weekly and monthly in our plants, refineries and mills. Certainly, a $10,000,000 machinery breakdown failure that occurs once every 10 to 20 years is not a small incident. However, when amortized over 10 to 20 years, it pales when compared to the chronic failures that keep occurring with a frequency so high that many of them become accepted as a cost of doing business.
     Consider a mining operation where conveyors carrying ore to surface treatment plants stopped an average of four times in an eight-hour shift. This was not unusual, as the belts are equipped with safety trip lines to protect personnel, and ore occasionally falls from the belts onto the trip lines. Since mining operations have miles and miles of conveyors, vehicles are equipped with radios to direct drivers to tripped belts. In the vast majority of trips, the drivers simply restarted the belts. Downtime averaged 15 minutes per trip. This was generally accepted as a cost of doing business. However, when it was pointed out that the trips represented 12.5% of available uptime, management looked at the problem differently.
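     The 12.5% figure follows directly from the numbers given. The short calculation below reproduces it. The annualized figure at the end rests on an assumption that is not in the example (three shifts a day, 365 days a year) and is included only to show how quickly small stops accumulate.

```python
# Figures from the conveyor example.
trips_per_shift = 4
minutes_per_trip = 15
shift_hours = 8

lost_minutes = trips_per_shift * minutes_per_trip   # 60 minutes per shift
shift_minutes = shift_hours * 60                     # 480 minutes per shift
lost_fraction = lost_minutes / shift_minutes         # 0.125

print(f"Uptime lost to trips: {lost_fraction:.1%}")  # prints "Uptime lost to trips: 12.5%"

# ASSUMPTION (not from the example): continuous operation, 3 shifts/day, 365 days/year.
shifts_per_year = 3 * 365
lost_hours_per_year = lost_minutes / 60 * shifts_per_year
print(f"Hours lost per year under that assumption: {lost_hours_per_year:.0f}")
```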
     Once the candidates for Root Cause Failure Analysis are delineated, the process of analysis begins. Teams are carefully selected, success is defined and clear criteria for recommendations are solicited from management. Careful and comprehensive work is planned and executed to secure failure information. These preliminary steps are vital to successful analysis. 
     A cigarette manufacturer experiencing millions of dollars in losses due to cigarette rod breaks had to define the characteristics of each break to determine break patterns. Six months of observation and high-speed photography identified five distinct types of break, each associated with a different area in the machinery.
     It is to be remembered that when performing Root Cause Failure Analysis, we are working on 80% of our losses. Usually, we will save millions of dollars. It is certainly worth the time and money to neutralize these losses.
     Analysis is secured through an iterative process of defining hypotheses, through deductive and inductive thinking, followed by verification. Most of the time, plant experts have a difficult time solving chronic failures. They feel they do not have the time to devote to a single problem, so they find themselves adopting solutions without proper analysis. We have found that 50% of the time, the quick fix does not work. Prove this to yourself by merely asking what your maintenance force does each day. Most of them are doing what they did yesterday and last week and last month.
     Most of the time, chronic failures are accepted as a cost of doing business because each incident is usually not very dramatic in terms of cost or time lost. What is missing in this thinking, of course, is frequency. While one incident may have a small impact, the combined effect of many similar incidents can be very large. This is why Failure Modes and Effects Analysis is so important. It takes into account frequency of occurrence as well as impact.
     Plant experts generally shy away from verifying hypotheses as they work their way down a logic tree. They feel it is not necessary because they already know the answer. When we are engaged to guide an analysis, the experts report that they learn a great deal through verification. It is verification that establishes the path that leads the way to root causes.
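     One way to picture this discipline is a logic tree in which each hypothesis carries its own verification status, and only verified hypotheses are followed further down. The sketch below is illustrative, not a prescribed tool: the class `Hypothesis`, its fields, the helper functions and the sample tree are all hypothetical, loosely modeled on the conveyor example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    statement: str
    verified: Optional[bool] = None   # None = not yet tested, True/False = test result
    evidence: str = ""
    children: List["Hypothesis"] = field(default_factory=list)


def unverified(node):
    """List every hypothesis in the tree that has not yet been tested."""
    found = [node.statement] if node.verified is None else []
    for child in node.children:
        found.extend(unverified(child))
    return found


def verified_paths(node, path=()):
    """Yield chains of verified hypotheses, from the top event to the deepest verified cause."""
    if node.verified is not True:
        return
    path = path + (node.statement,)
    deeper = [p for child in node.children for p in verified_paths(child, path)]
    if deeper:
        yield from deeper
    else:
        yield path


# Hypothetical tree; statements and evidence are illustrative only.
tree = Hypothesis("conveyor belt stops", True, "trip log", [
    Hypothesis("ore falls on trip line", True, "field observation", [
        Hypothesis("belt overloaded at transfer point"),
        Hypothesis("belt mistracking"),
    ]),
    Hypothesis("electrical fault", False, "no fault recorded"),
])

print("Still to verify:", unverified(tree))
print("Verified so far:", list(verified_paths(tree)))
```

     The point of the structure is that an untested hypothesis is visibly different from a confirmed one, so nobody can move further down the tree on the strength of "already knowing the answer."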
     When a logic tree of causes is driven down through the mechanical, process, human, and organizational or management systems causes, we can turn to solutions that will eliminate that problem from ever happening again. What's more, the fix can, most often, be extended to other systems in the facility. When that is done, we increase MTBFs and hence the reliability and availability of our manufacturing systems.
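     The link between MTBF and availability can be shown with the standard steady-state relationship, availability = MTBF / (MTBF + MTTR). This is offered as the commonly used definition rather than a formula stated at this point in the paper, and the hours used are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is able to run."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical values: a system that fails every 100 hours and takes 4 hours to restore.
before = availability(mtbf_hours=100.0, mttr_hours=4.0)

# Eliminating a root cause doubles the time between failures; repair time is unchanged.
after = availability(mtbf_hours=200.0, mttr_hours=4.0)

print(f"Availability before: {before:.1%}")
print(f"Availability after:  {after:.1%}")
```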
     Earlier, I said that failure analysis should be practiced on two tracks. Root Cause Failure Analysis is one of the tracks. It is so important because we find real solutions to those 10% to 20% of failures that represent 80% of our losses. What is left out are the failures that represent the other 20% of focused losses, the failures that are not our primary focus, and those administrative blocks or failures that prevent people from doing their best work. To fully exploit the potential inherent in failure analysis, we must address these failures also.
     We call our second investigative path the Failure Analysis/Problem Solving Track. This track is reserved for mechanics, operators and supervisors, people very close to the business of producing product. They know the impediments to productivity because they face them daily.
     To employ field people in this type of work, the methods to be used need to be designed for them. The method must provide something fulfilling for the field analysts. Time limitations, because of other duties, preclude the in-depth analyses utilized when Root Cause Failure Analysis work is performed. Support for field employees doing this work also needs to be accommodated.
     We have found that best results are obtained when field analysts are allowed to decide for themselves which failures or problems need to be resolved to allow them to do their best work. Field analysts are provided with a priority technique to select those failures or problems that have the highest impact on their work and are the easiest to accomplish. 
     Analysis is performed using a logic tree just as is done when Root Cause Failure Analysis is performed. However, validation routines are less stringent. This means that the confidence one can place in the results will be somewhat less. However, compared to the troubleshooting or trial and error methods commonly used, the gain in valid answers to failures is truly astonishing.
     When Failure Analysis/Problem Solving is utilized and supported, we have found that, even using very reasonable assumptions, manufacturing facilities ought to be able to realize benefits of at least $2,250,000 for every hundred people trained. Considering that the cost of the training and the direct support together will be $287,000, the returns expected are in the area of 800%. These are gold mines that await discovery. You will want to review the assumptions that support this extraordinary return. They are:
    Out of 100 people trained, it is expected that only 30 will initially participate. However, these 30 will form a critical mass that will draw in others as successes begin to accrue.
    Only 10% to 15% of work time will be used to work on failure analysis and problem solving. For the critical mass of 30 people, this amounts to roughly 3 additions to staff. In the economics, this is figured at $60,000 per person.
    Field Analysts will prefer to work in teams so the assumption is that there will be 15 teams of two working on failures and problems.
    Each team of two, working on average 12.5% of their available time, will be able to complete only 6 analyses/year. This is a conservative assumption.
    The average analysis will yield only $25,000 in direct cost reduction and documented opportunities like more uptime to produce product. This is the most conservative assumption.
    Another cost element has been included but it needs explanation. You may or may not know that more than $50 billion are spent each year in the United States on industrial training. 
    Studies vary, but they indicate that only between 1% and 20% of this training is translated into changed behaviors on the job. In other words, most training dollars are wasted. We have studied existing transfer environments and have concluded that most facilities have paradigm blocks that preclude successful transfers. Accordingly, we have included in our economics the cost of one trainer/mentor for every 100 trainee analysts at $70,000 per trainer/mentor.
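     The economics above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the assumptions. Note that the itemized costs listed (three staff equivalents plus one trainer/mentor) come to $250,000; the remaining $37,000 of the stated $287,000 total is not broken out in the text, so it is carried here as an unitemized difference rather than guessed at.

```python
# Benefit side, from the stated assumptions.
participants = 30                    # of 100 people trained
team_size = 2
analyses_per_team_per_year = 6
benefit_per_analysis = 25_000        # dollars

teams = participants // team_size                                            # 15 teams
annual_benefit = teams * analyses_per_team_per_year * benefit_per_analysis   # dollars

# Cost side, from the stated assumptions.
staff_equivalents = 3
cost_per_staff_equivalent = 60_000   # dollars
trainer_mentor_cost = 70_000         # dollars
itemized_cost = staff_equivalents * cost_per_staff_equivalent + trainer_mentor_cost

stated_total_cost = 287_000          # total given in the paper
unitemized_cost = stated_total_cost - itemized_cost   # not broken out in the paper

return_ratio = annual_benefit / stated_total_cost

print(f"Annual benefit:  ${annual_benefit:,}")
print(f"Itemized cost:   ${itemized_cost:,}")
print(f"Unitemized cost: ${unitemized_cost:,}")
print(f"Benefit / cost:  {return_ratio:.2f}")
```

     The ratio of benefit to stated cost is a little under 8 to 1, which is the basis of the "area of 800%" figure.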
     You can deduce that I am an advocate of trainers not merely providing class instructions but also mentoring the students that want to participate until students feel comfortable with their new skills. 
     The solutions that result from in-depth Root Cause Failure Analysis and Failure Analysis/Problem Solving will, without question, increase reliability to heights not imagined. The manager that wants to provide consistently good operation and product flow can have it. The documentation supporting this claim is mounting. Don't be left behind.
     Before I conclude this paper, I would be remiss if I did not express one caveat. The claims in this paper are real, but to obtain them, traditional thinking has to change. For example, management has to set out the expectation, but then they have to serve their employees in the attainment of those objectives. This is much easier to write than to do. The gold is available only to those that have the will, courage and perseverance to win it.
RCI offers the full range of Reliability Consulting Services and Training Programs for Industry. We conduct facilitations, reliability assessments, and FMEA & Root Cause Failure Analysis Training, both public and on-site.
For more information contact:
Reliability Center, Inc.
P.O. Box 1421
Hopewell, Virginia 23860
Phone: (804) 458-0645
Fax: (804) 452-2119
Website: http://www.reliability.com