I co-authored an entire chapter on the reliability engineer position description in Rules of Thumb for Maintenance and Reliability Engineers. That was back in 2008. Nearly two decades later, most plants still get this role wrong.
They hire a reliability engineer, hand them a desk, point them at the CMMS, and expect miracles. Six months in, the RE is buried in data entry, chasing work orders, and troubleshooting the same recurring failures with no authority to fix the root causes. The position becomes a fancy title for a reactive maintenance support role.
That’s a waste of talent and a missed opportunity.
What the Reliability Engineer Should Be Doing
The reliability engineer’s job is to systematically eliminate failures. Period. Everything in the role description should tie back to that single objective.

Here’s what that looks like in practice:
- Conducting failure modes and effects analysis (FMEA) on critical assets to identify and prioritize failure modes.
- Leading root cause failure analysis (RCFA) on significant or repetitive failures.
- Developing and optimizing preventive and predictive maintenance (PM/PdM) strategies based on actual failure data and equipment criticality.
- Analyzing mean time between failures (MTBF) trends to identify bad actors and drive improvement initiatives.
- Partnering with operations to implement defect elimination programs.
- Managing the reliability-centered maintenance (RCM) process for the facility.
Notice what’s absent from that list: data entry, parts chasing, scheduling, and daily firefighting.
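To make the FMEA item in the list above concrete, here is a minimal sketch of one common way to prioritize failure modes: the risk priority number (RPN), the product of severity, occurrence, and detection scores. The 1 to 10 scales, the `FailureMode` class, and the asset tags are assumptions for illustration, not a method prescribed in this article; many plants use their own criticality scoring instead.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    asset: str
    description: str
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (easily detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Risk priority number: severity x occurrence x detection.
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical entries, invented for illustration only.
modes = [
    FailureMode("C-201", "belt misalignment", 4, 5, 2),
    FailureMode("P-101", "mechanical seal leak", 7, 6, 4),
    FailureMode("P-101", "bearing overheating", 8, 3, 3),
]

for m in prioritize(modes):
    print(f"{m.asset:6} {m.description:22} RPN={m.rpn}")
```

The ranking is only as good as the scores behind it, so the scoring session with operators and craftspeople matters more than the arithmetic.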
How the Role Gets Hijacked
Problem 1: No Protected Time
The RE reports to the maintenance manager. The maintenance manager is fighting fires every day. When three pieces of critical equipment go down before lunch, guess who gets pulled off their FMEA to go troubleshoot? The reliability engineer.
Once this pattern starts, it never stops. The RE becomes a senior troubleshooter with an engineering title. The actual reliability work (the analysis, the strategy development, the long-term failure elimination) never gets done because there’s always a more urgent problem today.
Problem 2: No Authority to Implement
A reliability engineer can identify every failure mode, determine every root cause, and develop the perfect corrective action plan. If they don’t have the authority (or the organizational backing) to implement those changes, the work sits in a report that nobody reads.
I’ve seen this play out dozens of times. The RE presents a solid business case for upgrading a chronically failing pump. Maintenance management agrees. Operations says they can’t give up the downtime. Engineering says it’s not in the capital budget. And the pump keeps failing every six weeks.
Problem 3: Drowning in Data Entry
Some plants use the reliability engineer as the person responsible for cleaning up CMMS data, entering failure codes, building asset hierarchies, and generating reports for management. Those tasks matter. But they aren’t reliability engineering. They’re data administration.
If your RE is spending 60% of their time on CMMS housekeeping, you’ve hired an expensive data clerk.
How to Structure the Role for Success
Based on what I’ve seen work (and fail) across hundreds of facilities, here are the elements that separate effective reliability engineering programs from ones that just check a box on the org chart.
| Element | What It Means in Practice |
| --- | --- |
| Reporting Structure | RE reports to plant manager or engineering manager, not directly to the maintenance supervisor. This separation protects the role from being consumed by daily reactive work. |
| Protected Time | Minimum 70% of the RE’s time is allocated to proactive reliability work (FMEA, RCFA, PM optimization, bad actor analysis). This is non-negotiable and leadership must enforce it. |
| Implementation Authority | RE has the organizational backing to implement corrective actions through the planning and scheduling process. Their recommendations carry weight with both maintenance and operations. |
| KPI Ownership | RE owns and reports on MTBF improvement, bad actor reduction, PM effectiveness, and reliability-related cost avoidance. These metrics are reviewed monthly with plant leadership. |
| Defined Scope | Clear boundaries on what falls inside the RE role and what doesn’t. Data entry, scheduling, and reactive troubleshooting are explicitly excluded unless the RE chooses to engage. |
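The 70% protected-time target is easy to state and easy to lose track of. As a rough sketch, assuming the RE logs hours by activity, the share of proactive work can be checked like this. The activity names, the `proactive_share` helper, and the sample hours are hypothetical.

```python
# Activities counted as proactive reliability work (assumed category names).
PROACTIVE = {"FMEA", "RCFA", "PM optimization", "bad actor analysis"}
TARGET_SHARE = 0.70  # minimum proactive share named in the table above


def proactive_share(hours_by_activity):
    """Fraction of logged hours spent on proactive reliability work."""
    total = sum(hours_by_activity.values())
    if total == 0:
        return 0.0
    proactive = sum(h for a, h in hours_by_activity.items() if a in PROACTIVE)
    return proactive / total


# Hypothetical month of logged hours, invented for illustration.
month = {
    "FMEA": 40,
    "RCFA": 30,
    "PM optimization": 20,
    "bad actor analysis": 10,
    "CMMS data entry": 40,
    "reactive troubleshooting": 20,
}

share = proactive_share(month)
meets_target = share >= TARGET_SHARE
print(f"Proactive share: {share:.1%}, meets target: {meets_target}")
```

In this made-up month the RE spends 100 of 160 hours on proactive work, so the role falls short of the target even though most of the time looks productive.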
The KPIs That Tell You If It’s Working
You’ll know the reliability engineer role is functioning properly when you can point to measurable improvements in a few key areas:
- MTBF trending upward on your top critical assets over a 6- to 12-month period.
- Bad actor list shrinking. The top 20% of failure-producing assets should be rotating as problems get resolved and new ones surface.
- PM task list getting leaner and more effective. Unnecessary PMs eliminated, condition-based tasks replacing intrusive ones, and PM compliance correlated with actual reliability improvement.
- Root cause actions being implemented, not just documented. Track the percentage of RCFA recommendations that result in completed corrective actions within 90 days.
If the RE has been in the role for 12 months and you can’t point to specific assets where reliability improved because of their work, something is wrong with how the role is structured.
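Three of the KPIs above reduce to simple arithmetic once failure history and RCFA actions are exported from the CMMS. The sketch below assumes MTBF is calculated as operating hours divided by number of failures, ranks bad actors by failure count, and measures the share of RCFA actions completed within 90 days. The function names, asset tags, and sample records are hypothetical, and a real export will need cleanup first.

```python
import math
from collections import Counter
from datetime import date


def mtbf_hours(operating_hours, failure_count):
    """MTBF = operating time / number of failures (None if no failures)."""
    if failure_count == 0:
        return None
    return operating_hours / failure_count


def bad_actors(failed_assets, fraction=0.2):
    """Top `fraction` of failing assets, ranked by failure count."""
    counts = Counter(failed_assets)
    n = max(1, math.ceil(len(counts) * fraction))
    return [asset for asset, _ in counts.most_common(n)]


def rcfa_completion_rate(actions, window_days=90):
    """Share of RCFA actions completed within `window_days` of being issued.

    `actions` is a list of (issued_date, completed_date_or_None) tuples.
    """
    if not actions:
        return 0.0
    on_time = sum(
        1 for issued, done in actions
        if done is not None and (done - issued).days <= window_days
    )
    return on_time / len(actions)


# Hypothetical data: one entry per failure event, by asset tag.
failures = ["P-101"] * 8 + ["C-201"] * 3 + ["P-102"] * 2 + ["F-301", "M-401"]
actions = [
    (date(2024, 1, 10), date(2024, 2, 20)),  # completed in 41 days
    (date(2024, 1, 15), date(2024, 6, 1)),   # completed, but after 90 days
    (date(2024, 2, 1), None),                # still open
    (date(2024, 3, 1), date(2024, 4, 15)),   # completed in 45 days
]

print(mtbf_hours(4000, failures.count("P-101")))  # 500.0
print(bad_actors(failures))                       # ['P-101']
print(rcfa_completion_rate(actions))              # 0.5
```

Run monthly against the same asset list, these numbers give plant leadership a trend rather than a snapshot, which is what the 6- to 12-month view calls for.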
It’s a Leadership Problem
The reliability engineer role fails most often because of leadership, not because of the person in the seat. Plants hire competent engineers and then put them in a structure that makes success impossible.
If you want reliability engineering to work, you have to protect the role. You have to give the RE authority. You have to hold operations and maintenance jointly accountable for implementing their recommendations. And you have to stop pulling them into the daily firefight every time something breaks.
That takes discipline from the top. The same discipline we talk about with planning and scheduling, with PM programs, with storeroom management. It all comes back to leadership commitment to doing the hard work of proactive maintenance.
Get the structure right, protect the role, and let your reliability engineer do what you hired them to do: eliminate failures, one root cause at a time.