Why Your RCA Effort Is Doomed (Unless You Fix These Issues)

by Bob Latino


Most of us have worked at places with some degree of a Root Cause Analysis (RCA) effort. They likely all defined and practiced ‘RCA’ differently, but nonetheless, they had something called RCA.

Most facilities have an RCA effort, but few make it truly effective.

What made one facility better at it than another? Why was one facility’s RCA more effective than another?

Having been in this space for 38+ years, this has been quite frustrating for me to watch. As consultants, our value is like ‘industrial espionage,’ but not in a criminal way. We are fortunate to observe how people operate in different industries, geographical regions, and cultures. We see a cross-section of how people and processes behave. Given that, we are in a position to share those general observations.

I could easily list what I see as the Top 10 reasons RCA efforts don’t last, but I’ll spare you the additional reading and focus on what I see as the Top 5.

Challenges in RCA

1. Lack of a True RCA Champion

The best Root Cause Analysis efforts I have seen revolve around a true Champion who walks the talk. These people understand what their RCA analysts must endure to be effective. They provide analysts with the tools, training, support, and expectations they deserve to do the analyses properly.

Unfortunately, these people are rare. They are in these jobs because they want to be, not because they have to be. The problem companies often have is that when they do have such a rare individual, they most certainly will be on the corporate fast track. They will likely be in that position for 1 to 3 years, and then they move on up.

A successful RCA effort will be institutionalized, meaning it will be ingrained in how we do business and will survive the loss of the Champion and turnover in Leadership. This should be a key element of designing an effective RCA effort. To me, the absence of a Champion, and of an effort built to outlast one, is the leading reason why RCA efforts fail!

2. Lack of Analytical Breadth and Depth

In my company's experience, when a bid comes out for ‘RCA’ and the requirements are generic (non-specific about what the buyer wants), the buyer is just looking for the lowest bidder. Root Cause Analysis is a commodity to such people, and the specific RCA approach itself has no value. All approaches are considered equal when RCA is viewed as a commodity and not valued.

Not all RCA methods are equal—treating them as interchangeable is a fatal mistake.

Under that view, brainstorming, 5-Whys, Fishbones, and evidence-based causal tree approaches all carry the same weight. That simply is NOT TRUE. Effective RCAs require appropriate breadth and depth. In looking at breadth, consider the difference between asking “How Can” and asking “Why”.

“How Can” will explore all possibilities instead of just the single obvious observation. Where depth is concerned, stopping at the physics of failure and replacing parts will not necessarily prevent the next failure. A true RCA will drill down to discover the inappropriate decisions that were made and WHY they were made.

If a bearing fails due to fatigue, and we just replace it, is the problem solved? No. Where did the fatigue come from? If we find someone misaligned the pump, causing the fatigue, and we discipline them, does the problem go away?

No, we need to understand why that person felt aligning the way they did was appropriate. We may find they didn’t know how to align properly, the procedures were obsolete or non-existent, the tools they had were inadequate, or they were simply time pressured and took shortcuts. If we go to this depth and solve the systemic problems, we will work on preventing recurrence.

3. Lack of Adequate Evidence

When anyone is time pressured to do anything, they will often take shortcuts. The most time-consuming task in an effective Root Cause Analysis is the collection of evidence to prove our hypotheses. When we are time-pressured in RCA, that is where we tend to take shortcuts.

This is like a detective saying they don’t need evidence from the crime scene to make their case. It just doesn’t work like that. No one goes to court with hearsay and tries to make it fly as fact. An effective RCA effort will require the proper degree of evidence to support its hypotheses.

4. Striving Only to Meet Minimum Requirements (Regulatory and Procedural)

Unfortunately, a compliant Root Cause Analysis effort does not guarantee any improvement in Reliability or Safety. I see many RCA efforts in the field where success is defined as being compliant. Typically, this happens in highly regulated settings, such as high-hazard industries and hospitals.

A compliant RCA doesn’t guarantee reliability or safety—only real improvements do.

Almost all 6,000 hospitals in the US are accredited, meaning they pass the regulatory audit and will receive federal Medicare and Medicaid monies. This accreditation includes their RCA efforts. However, deaths due to medical error are consistently among the Top five killers of all Americans.

This demonstrates a disconnect between a compliant RCA system and actual patient safety. An effective RCA effort will measure success based on actual improvements in the process. That means bottom-line metrics (financial, safety, environmental, and quality) as well as leading metrics focused on adherence to the proper steps of a true RCA (i.e., adequacy of the evidence collected to support hypotheses).

5. Lack of Understanding of Social Sciences

A big missing link in the traditional application of ‘RCA’ is understanding why good people often make the wrong decisions at the time they do. This gets into the field of the Social Sciences. I myself did not have a great enough appreciation for this field until I started researching the correlations between Reliability and Safety.

As RCA analysts, we must better understand human reasoning and the impact of organizational systems on decision-making. Engineers typically shine when delving into the physics of a failure. They are often lost when they delve into understanding the ‘soft’ stuff, like human reasoning.

Conversely, social scientists shine in understanding decision-making and intent but are lost when it comes to understanding the physics of a failure.

BONUS REASON: Working on What’s Urgent, Not What’s Important.

It’s hard to pick the proper priority from this list, as I feel they are all equally important. I find that a formal ‘RCA’ is usually not conducted unless a corporate/site trigger has been met. Such triggers will be thresholds based on production losses, equipment damage, injuries/fatalities, and environmental excursions, to name a few.

However, that is too late from an ‘RCA’ purist’s standpoint. That is a reactive use of RCA, which is typical. Ideally, we’d like to be able to apply RCA before these catastrophic events so we can avoid the risk of them. How can we do that? We can apply the concepts of effective RCA to chronic failures (the ones that do not rise to the levels of our triggers), high-severity near-misses, and unacceptable risks from risk assessments like FMEAs.
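To make the point about chronic failures concrete, here is a minimal sketch of how events that never meet a per-event trigger can still add up to a loss worth analyzing. Everything in it is a hypothetical assumption for illustration: the trigger value, the failure modes, the costs, and the function name `rca_candidates` are not taken from any real site or standard.

```python
from collections import defaultdict

# Hypothetical per-event trigger: a formal RCA is launched only when a
# single event costs at least this much. Illustrative value only.
TRIGGER_COST = 100_000

# Hypothetical one-year event log of (failure mode, cost of one occurrence).
events = (
    [("compressor trip", 250_000)]
    + [("pump seal leak", 8_000)] * 15
    + [("conveyor jam", 2_000)] * 3
)


def rca_candidates(events, trigger_cost):
    """Return (triggered, chronic) failure modes.

    triggered: modes where at least one single event meets the trigger.
    chronic:   modes where no single event meets the trigger, but the
               summed cost of all occurrences does.
    """
    totals = defaultdict(int)  # summed cost per failure mode
    worst = defaultdict(int)   # largest single-event cost per failure mode
    for mode, cost in events:
        totals[mode] += cost
        worst[mode] = max(worst[mode], cost)

    triggered = sorted(m for m in totals if worst[m] >= trigger_cost)
    chronic = sorted(
        m for m in totals
        if worst[m] < trigger_cost and totals[m] >= trigger_cost
    )
    return triggered, chronic


triggered, chronic = rca_candidates(events, TRIGGER_COST)
print("Formal trigger met:", triggered)
print("Chronic, below trigger per event:", chronic)
```

In this made-up log, the pump seal leak never comes close to the trigger on any single day, yet its fifteen occurrences together cost more than the trigger value. A purely reactive program would never look at it. Cost is only one possible yardstick here; the same tally could be run on downtime hours or near-miss counts.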

While this is my experience, I’d be very interested in hearing from you about places you have worked (or currently work) and what prevents their Root Cause Analysis effort from realizing its potential. Please feel free to email me at blatino@prelical.com.

Author

Bob Latino

Principal of Prelical Solutions, LLC and former CEO of Reliability Center, Inc. (RCI), Bob has 38+ years of global experience in Root Cause Analysis (RCA). He’s trained over 10,000 professionals in 25+ countries and co-authored ten books on RCA, FMEA, and Reliability. Bob serves on the Board of the Community of Human and Organizational Learning (CHOLearning), and is Series Editor for CRC Press’s “Reliability, Maintenance, and Safety Engineering” series.