“It’s Within Spec” — The Most Dangerous Phrase in Maintenance
A bearing glows red, clearly in thermal distress, and the technician shrugs: “Says right here it’s still within limits.” Sound familiar?
This scene—captured with brutal accuracy in the cartoon—is not a joke to anyone who’s lived through real-world equipment failures. It highlights a chronic issue in industrial environments: the tendency to defer to specification sheets even when the visual, audible, or thermal signs of impending failure are unmistakable.
Call it the “within spec” equipment failure: the equipment is technically compliant, yet it is still failing. Why? Because specs aren’t designed to optimize performance; they define allowable margins. Margins that are too wide. Margins that ignore how heat, contamination, and vibration compound one another. Margins that cost you uptime.
If you’re relying on spec compliance as your primary safeguard, you’re flying blind.
How “Within Spec” Equipment Failures Happen
Here’s the core issue: most engineering specs are built to accommodate warranty coverage and product liability—not maximum asset reliability.
Consider bearing temperatures. An OEM may state that 200°F is acceptable. But prolonged exposure to even 180°F, comfortably under that limit, dramatically shortens grease life and accelerates failure modes like oxidation and oil bleed. It’s “within spec,” but it’s burning your margin down.
Or take oil cleanliness. Gearboxes with large sump volumes often have no filtration and rely solely on settling. The OEM spec might allow an ISO 4406 cleanliness code of 22/20/17. That may look fine for the first five years. Then the sludge, varnish, and wear particles tell a different story.
Vibration is another example. You’re “within spec” on overall velocity, but an FFT spectrum reveals sidebands of rising amplitude, indicative of early-stage bearing fatigue. The damage is underway. And your overall trend won’t show it, because the alarm is set to trip far too late in the failure progression.
The root of within-spec equipment failures lies in this gap between engineering tolerance and reliability precision. If you’re not operating in a tight corridor of control (what some call a “reliability spec”), you’re at risk, even if your data says you’re in bounds.
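The vibration case is easy to demonstrate with numbers. The sketch below is illustrative only: it builds a synthetic velocity signal from a strong 1x running-speed tone plus a small defect tone, then compares the overall RMS value with the amplitude at the defect frequency. The frequencies, amplitudes, and function names are assumptions made up for the example, and a single tone stands in for the richer sideband pattern a real bearing defect produces.

```python
import math

FS = 2048          # sample rate in Hz (assumed)
N = 2048           # one second of data, so every integer frequency is an exact bin
RUN_HZ = 30        # hypothetical 1x running-speed frequency
DEFECT_HZ = 157    # hypothetical bearing defect frequency

def synth_signal(defect_amp):
    """Velocity signal in in/s: dominant 1x tone plus a small defect tone."""
    return [0.20 * math.sin(2 * math.pi * RUN_HZ * k / FS)
            + defect_amp * math.sin(2 * math.pi * DEFECT_HZ * k / FS)
            for k in range(N)]

def overall_rms(x):
    """Broadband RMS, the kind of number an overall-velocity alarm watches."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def tone_amplitude(x, freq):
    """Single-bin DFT: peak amplitude of the component at `freq`."""
    re = sum(v * math.cos(2 * math.pi * freq * k / FS) for k, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * k / FS) for k, v in enumerate(x))
    return 2 * math.hypot(re, im) / len(x)

baseline = synth_signal(0.005)   # early survey
later = synth_signal(0.020)      # defect tone has quadrupled

rms_change = overall_rms(later) / overall_rms(baseline) - 1
tone_growth = tone_amplitude(later, DEFECT_HZ) / tone_amplitude(baseline, DEFECT_HZ)

print(f"Overall RMS change: {rms_change:.2%}")
print(f"Defect-tone growth: {tone_growth:.1f}x")
```

In this made-up case the defect tone grows fourfold while the overall value moves by less than one percent, which is why an alarm on overall velocity alone stays quiet.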
Tighten Tolerances or Accept Chronic Failures
This is where proactive organizations separate from the reactive pack. They understand that “spec” is the floor, not the ceiling. So they redefine their own internal limits based on failure history, empirical data, and a relentless pursuit of uptime.
Examples:
- Lubricant cleanliness: If the OEM allows ISO 20/18/15, you adopt ISO 16/14/11 for gearboxes and ISO 14/12/10 for hydraulics.
- Bearing temps: You set an upper control limit of 160°F, knowing that, as a rule of thumb, every 18°F (10°C) above that roughly halves lubricant life.
- Alignment: You abandon straight-edge checks for laser alignment with tolerances 3x tighter than OEM guides.
- Vibration limits: You use filtered, narrow-band spectrum analysis to catch defects long before they trip standard alerts.
This is how world-class operations build asset reliability into daily operations: they compress tolerances, reduce variance, and question specs that fail to produce durable outcomes.
Failure isn’t a mystery when the conditions that led to it were always “within spec.” That’s not an alibi—it’s an indictment.
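To put numbers on the bearing temperature example, here is a minimal sketch of the halving rule of thumb quoted above. The function name and the choice of 160°F as the reference point are assumptions for illustration; the rule is an approximation, not a property of any particular grease, and extrapolating it below the reference temperature is not meaningful.

```python
def relative_lubricant_life(temp_f, reference_f=160.0, halving_step_f=18.0):
    """Life relative to life at `reference_f`, halving every `halving_step_f` degrees above it."""
    return 0.5 ** ((temp_f - reference_f) / halving_step_f)

# 200 F is the hypothetical OEM limit used earlier in the article.
for temp in (160, 178, 196, 200):
    print(f"{temp} F: {relative_lubricant_life(temp):.2f}x reference life")
```

By this estimate, a bearing that runs at the 200°F limit gets roughly a fifth of the lubricant life it would get at 160°F, while never once leaving spec.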
Rethinking KPIs and the Role of Specifications
So why do organizations cling to specs as performance indicators? Because they’re easy to audit, document, and defend. But that’s not the same as optimizing.
Modern reliability programs need to move from spec-based compliance to performance-based vigilance. It starts with questioning assumptions:
- Are your spec limits driving uptime, or just documenting that nothing has failed yet?
- Do your PM and PdM programs measure variance from optimal, not just threshold exceedance?
- Have your frontline teams been empowered to override specs when real-world signals contradict them?
Metrics need to reflect this shift too:
- Replace “within spec” reports with control charts showing statistical variation around ideal ranges.
- Trend lubricant life in relation to temperature and cleanliness, not just hours.
- Use failure data to adjust alarm limits and redefine acceptable operating windows.
Specs should support reliability goals—not conflict with them. If your KPIs show everything’s green, but assets are still failing, you’re measuring the wrong things.
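A control chart does not need special software to get started. The sketch below is one simple way to do it, not a definitive method: it derives limits from a baseline period using the mean plus or minus three sample standard deviations, then checks recent readings against both the control limits and the spec limit. The temperature readings and the 200°F limit are invented for illustration, and a formal individuals chart would normally estimate sigma from moving ranges instead.

```python
import statistics

def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits from a stable baseline period."""
    center = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread

# Hypothetical bearing temperatures in degrees F.
baseline_temps_f = [141, 143, 142, 140, 144, 142, 141, 143, 142, 142]
recent_temps_f = [143, 146, 149, 153]
OEM_LIMIT_F = 200  # assumed spec limit

lcl, center, ucl = control_limits(baseline_temps_f)

out_of_control = [t for t in recent_temps_f if t < lcl or t > ucl]
over_spec = [t for t in recent_temps_f if t > OEM_LIMIT_F]

print(f"Control limits: {lcl:.1f} to {ucl:.1f} F (center {center:.1f} F)")
print(f"Out of control: {out_of_control}")
print(f"Over spec:      {over_spec}")
```

Three of the four recent readings fall outside the control limits while the “within spec” report still shows nothing to act on.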
Conclusion: From “Spec Compliant” to “Reliability Driven”
Let’s be clear: specifications have value. But they’re not sacred. When within-spec equipment failures become a recurring theme, it’s time to challenge those specs.
The best plants don’t rely on paperwork—they rely on outcomes. They train their teams to recognize when the spec is wrong, when the machine is talking, and when the data is whispering early warnings that generic thresholds won’t catch.
Don’t settle for “within limits.” Set your own. Tighter. Smarter. Data-driven. Because in reliability, the spec isn’t the standard—performance is.
Want to operationalize this mindset?
- Run a spec audit across your plant: What limits are you following? Do they reflect actual failure modes?
- Define “optimal” for your top 10 critical assets—temperature, vibration, cleanliness, etc.
- Build dashboards that highlight deviations from ideal, not just from maximum.
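For the dashboard step, one possible starting point is to rank assets by how much of the window between ideal and maximum each has already consumed. The sketch below assumes higher readings are worse; the asset names, values, and the `margin_used` metric are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    asset: str
    value: float
    ideal: float     # your own "optimal" target
    maximum: float   # the spec limit

    def margin_used(self) -> float:
        """Fraction of the ideal-to-maximum window consumed (0 = at ideal, 1 = at the limit)."""
        return (self.value - self.ideal) / (self.maximum - self.ideal)

readings = [
    Reading("Gearbox G-7 bearing temp (F)", 150, 140, 200),
    Reading("Pump P-101 bearing temp (F)", 185, 140, 200),
    Reading("Fan F-3 vibration (in/s)", 0.12, 0.05, 0.30),
]

# Worst offenders first, even though every reading is still within spec.
ranked = sorted(readings, key=lambda r: r.margin_used(), reverse=True)

for r in ranked:
    print(f"{r.asset}: {r.margin_used():.0%} of margin used")
```

Every asset in this example would show green on a spec-compliance report, but the ranking makes it obvious where to look first.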









