Once upon a time, the industry understood there would be a massive knowledge transition as the Baby Boomers retired en masse, with the peak of the curve hitting shortly after 2016. Tremendous work began as far back as the late 1990s on documentation and knowledge transfer, balanced with an understanding of the basics. One of the largest areas of concern was the skilled trades.
The Workforce Shift: A Lost Generation of Reliability Knowledge
Professional societies and organizations worldwide worked on a variety of programs promoting the trades, with an emphasis on reliability and maintenance. In 2018, the Society for Maintenance and Reliability Professionals (SMRP), following collaborative work on the Hill and White House visits, helped carry Raja Krishnamoorthi’s “Strengthening Career and Technical Education for the 21st Century Act,” which reauthorized the Carl D. Perkins Career and Technical Education Act, across the line.
Reliability knowledge isn’t lost in a single retirement; it’s lost one undocumented lesson at a time.
The act provided funding for trade schools and other education initiatives, including cybersecurity, and was approved unanimously by both the House and the Senate. SMRP supported the use of funds for cybersecurity training and pushed in Congress for some of the funding to go to trade schools, since the tradespeople installing connected technology would need a fundamental understanding of those systems.
The kingdom of reliability and maintenance was becoming more complex as new sensors entered the market and the term ‘data science’ entered common industrial use around 2015. Concepts such as the ‘digital twin’ were revisited after earlier, scattered attempts in the 1990s with VRML (the Virtual Reality Modeling Language) and the much larger, more complex computer systems of the time.
With the arrival of the multi-core chip for personal computers and laptops in 2007, these concepts leaked back into business areas and slowly into reliability and maintenance. During this time, software companies realized they could ‘rent’ software rather than have their customers make an outright purchase and own it.
Software as a Service (SaaS), in which the software vendor could turn access on or off, was born. This impacted everything from technology to critical software systems within R&M – you no longer owned critical systems, and there was a stronger push to maintain everything in the ether (the cloud), also known as ‘someone else’s computer.’
Technology’s Rise and the Illusion of Progress
In the meantime, technicians were still greasing bearings, still running vibration routes, still performing other maintenance tasks. However, companies were told they could allow older employees to transition out before onboarding their replacements, eliminating any overlap or handover of knowledge.
This was evident at one automotive company, where by 2008 a single plant had over 200 tradespeople but only two apprentices; the youngest tradesperson had over 30 years with the company, and retirement was an option at 25 years. Between 2008 and 2010, layoffs and the departure of workers already past retirement age drained the industry, with a repeat during the 2020 pandemic.
At this time, there was a heavy push for a ‘return to basics’ approach within the reliability and maintenance population to get increasing failures in ever more complex systems under control. Most of the push was to ensure that the new generation entering physical asset management would understand the why and how of maintaining equipment reliability.
2020 was devastating to the reliability and maintenance community in some ways and a benefit in others. It could have been a reset: ramping facilities back up and returning equipment to service offered a chance to set baselines and to teach the incoming workforce the intricacies of the systems they were responsible for.
However, a series of articles in the financial press promoted remote monitoring of equipment, coupled with academic papers since 2014 on ‘self-healing systems’ that implied wear items such as bearings could ‘heal.’ Consulting firms such as Deloitte and McKinsey saw in this sensor windfall, and the massive improvements claimed for it, a way to push early fault detection as a method to ‘streamline’ (reduce) reliability and maintenance headcount.
Technology can amplify maintenance, but it can’t replace the discipline that makes it work.
Existing and startup sensor and remote monitoring companies suddenly attracted astronomically high investment offers based on reports such as the ‘Global Predictive Maintenance Market Report 2021.’
This report, like others, cited 2020 market values ranging from a few tens of millions of dollars to $4.5 billion, projected to grow to over $50 billion (more than $64 billion in this particular report) at an extreme compound annual growth rate (CAGR) on the order of 31%. Interestingly, the only reports showing movement in this direction come from the consulting firms themselves.
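The growth math behind those headlines is worth checking. A quick sketch using the report’s $4.5 billion 2020 baseline and 31% CAGR (the ten-year horizon is my assumption for illustration):

```python
# Back-of-envelope check of the market projection: value = base * (1 + CAGR) ** years.
# The $4.5B baseline and 31% CAGR come from the report cited above; the
# ten-year horizon is an assumption for illustration.
base_2020_usd_b = 4.5      # claimed 2020 market size, $ billions
cagr = 0.31                # claimed compound annual growth rate

for years in range(1, 11):
    projected = base_2020_usd_b * (1 + cagr) ** years
    print(f"{2020 + years}: ${projected:,.1f}B")

# Roughly $51B by 2029 and $67B by 2030, which is why a >$64B headline figure
# implies sustaining ~31% annual growth for a full decade.
```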
When AI Promises Outrun Practical Reliability
The fallback is that ‘AI systems are still maturing’ in the predictive maintenance (PdM) space. Most vendors are pushing the adoption of technology ahead of getting systems and information under control. There are even a few CMMS (computerized maintenance management system) companies that advertise the use of AI to ‘fill in,’ or synthesize, missing or poor-quality data to enable predictive analytics.
These are not minor organizations making this claim to use generative AI to generate insights from incomplete logs and sensor data. One claims: “Generative AI helps fill gaps in maintenance records and sensor logs, allowing predictive models to function even when real-world data is incomplete.” (I am not linking this in the article because I believe it is dangerous.)
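To illustrate why I consider it dangerous, here is a minimal sketch with entirely hypothetical numbers showing what happens when a gap in a condition-monitoring trend is ‘filled in’ with values synthesized from normal history: the developing fault is smoothed away and the alarm never fires.

```python
# Hypothetical illustration: filling a gap in a vibration trend with values
# synthesized from healthy history can hide a developing fault.
# All numbers are made up for illustration.

healthy_rms = [2.1, 2.0, 2.2, 2.1, 2.0]       # baseline vibration (mm/s RMS)
actual_rms  = [2.1, 2.2, 2.6, 3.4, 4.7, 6.1]  # real readings lost in the gap: a fault developing
alarm_limit = 4.5                             # hypothetical alarm threshold

# A generative "fill-in" trained on normal operation tends to reproduce the baseline.
baseline = sum(healthy_rms) / len(healthy_rms)
synthetic_fill = [round(baseline, 1)] * len(actual_rms)

def first_alarm(series, limit):
    """Return the index of the first reading at or above the alarm limit, or None."""
    for i, value in enumerate(series):
        if value >= limit:
            return i
    return None

print("Alarm on real data at sample:       ", first_alarm(actual_rms, alarm_limit))      # 4
print("Alarm on synthesized fill at sample:", first_alarm(synthetic_fill, alarm_limit))  # None
```

A model that fills gaps with plausible values is, by construction, biased toward what it has seen most often, which is normal operation; the one thing it must preserve is the anomaly.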
Before November 2022, most people thought the term AI belonged strictly to the realm of science fiction. Then ChatGPT, built on GPT-3.5, was released to the world and generated excitement among users, who received carefully engineered responses designed to echo the question being asked.
What amounted to a complex chatbot, which is what an LLM is designed to be, revolutionized most people’s view of machine learning methods despite massive false positive and false negative rates, hallucinations, and the danger of harm.
This has spurred massive competition among a small handful of high-tech companies, including OpenAI, Amazon (AWS), Microsoft, IBM, and xAI, among others. This growth has driven data center builds from the already energy-intensive 10 MW to a few hundred MW range up to campuses of 500 MW to 1.5 GW, far outstripping the ability of energy production to keep up, let alone the availability of clean water.
The result has been rising electrical energy costs in multiple regions, with the East Coast through areas of the Midwest seeing price increases averaging 10% between 2024 and 2025 and roughly 30% from 2020 to 2025 (Forbes). Data centers are on a path to consume up to 12% of the electrical energy produced by 2028 and, with the present energy mix, to account for at least 185 million metric tons of CO2 per year.
To put this into perspective, the primary metals industry (iron and steel) accounts for less than 2% of electrical energy use, aluminum for roughly 1%, petroleum refining for about 1.5%, automotive manufacturing for less than 0.4%, and all water and wastewater production and treatment combined for less than 2%; all other industries account for lower values. This makes data centers the most energy-intensive industry in the world, with expectations that their consumption will double by 2040 if the current trajectory continues.
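Those two headline figures, 12% of generation and roughly 185 million metric tons of CO2, hang together under fairly simple assumptions. A rough check, assuming roughly 4,200 TWh of annual US generation and an average grid intensity of about 0.37 kg of CO2 per kWh (both round-number assumptions for illustration):

```python
# Rough check relating the 12% consumption figure to the CO2 estimate.
# The generation total and grid carbon intensity below are assumptions for
# illustration, not figures cited above.
us_generation_twh = 4200          # approx. annual US electricity generation, TWh
datacenter_share = 0.12           # projected data center share by 2028 (cited above)
grid_intensity_kg_per_kwh = 0.37  # assumed average grid CO2 intensity, kg CO2/kWh

datacenter_twh = us_generation_twh * datacenter_share            # ~504 TWh
co2_mt = datacenter_twh * 1e9 * grid_intensity_kg_per_kwh / 1e9  # kg -> million metric tons

print(f"Data center consumption: ~{datacenter_twh:.0f} TWh/yr")
print(f"Implied emissions: ~{co2_mt:.0f} million metric tons CO2/yr")  # ~186 Mt
```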
Every new data center promises intelligence, yet consumes the energy of entire cities and small countries.
It is the number one issue being investigated across the energy, grid, and related industries, as data center demand is, by policy, prioritized over residential needs while the general population foots the bill for the infrastructure and production.
Lessons from the AWS Outage: Dependency and Data Vulnerability
The combination of skilled workforce needs and the promise of workforce reduction through automation/sensors, coupled with the fantasy of AI developing solutions without meaningful data, has inspired massive investment in reliability and maintenance startups.
While IEC and IEEE standards exist for governance and ethics, most are being bypassed through ‘collaborative’ efforts in which the high-tech industry governs itself and sets separate standards of its own making through policy and legislation.
The lack of adherence to industry standards contributed to the October 19, 2025, internet outage caused by AWS, when a fault in its DNS (‘address book’) software affected systems worldwide, public and private, and resulted in hundreds of billions of dollars (USD) in losses.
In effect, everyone learned how dependent their systems were on a handful of high-tech platforms into which most ‘AI’ and database systems are tied, and how little redundancy exists. It was bound to happen, and it could have been much worse. As experts in the field put it, if this were an earthquake, it would have been only a 4.5; much larger ones are coming.
The other thing the AWS outage exposed was how dependent cloud-based analytics systems are on these services. For at least 24 hours, companies’ ability to monitor their systems (again, both public and private) ceased, along with some production systems and related records.
We use these systems ourselves, and I was paying attention: we noted that historical database records were corrupted on return to service. Luckily, my paranoia meant our local systems immediately rewrote the lost data, and because our alert system runs at the ‘edge,’ alerts and alarms did not hesitate during this period. Others were apparently not as lucky.
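For those asking what ‘edge’ bought us here, the pattern is not exotic: evaluate alarms locally against local data, keep a local system of record, and treat the cloud as a mirror to be synchronized when it is available. A minimal sketch of that store-and-forward pattern (the names, thresholds, and sync stub are illustrative, not any particular vendor’s product):

```python
# Minimal store-and-forward sketch: alarms are evaluated locally so a cloud
# outage cannot delay them, and readings are buffered locally until a cloud
# sync succeeds. Names, thresholds, and the sync stub are illustrative only.
from collections import deque

ALARM_LIMIT = 4.5        # hypothetical vibration alarm limit (mm/s RMS)
local_buffer = deque()   # readings awaiting successful cloud sync
local_history = []       # local system of record

def cloud_sync(reading) -> bool:
    """Stub: push one reading to the cloud historian; False if unreachable."""
    return False  # simulate the outage

def ingest(timestamp, rms):
    # 1. Alarm locally and immediately; no cloud dependency.
    if rms >= ALARM_LIMIT:
        print(f"{timestamp}: LOCAL ALARM, RMS={rms}")
    # 2. Record locally, then try to mirror to the cloud.
    reading = (timestamp, rms)
    local_history.append(reading)
    if not cloud_sync(reading):
        local_buffer.append(reading)  # retry later; nothing is lost

def flush_buffer():
    """Replay buffered readings once the cloud is reachable again."""
    while local_buffer and cloud_sync(local_buffer[0]):
        local_buffer.popleft()

ingest("2025-10-20T08:00", 4.7)  # alarm still fires during the outage
flush_buffer()
print(f"Readings awaiting sync: {len(local_buffer)}")
```

The point is not the code; it is that the alarm path never waits on someone else’s computer.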
Reclaiming Reliability: A Call to Return to CBM Fundamentals
The challenges we are presently running into, as the AI/sensor hype works its way toward the inevitable common sense of using these systems correctly and in collaboration with human expertise, are not insignificant. For instance, while companies are sold on cost reductions, most are experiencing cost increases due to false positives.
This hurts the reputation of R&M and condition-based maintenance: under good conditions, some sites are experiencing combined false positive and false negative rates of 25%, while under poor conditions others are seeing combined rates of more than 80%. These figures come from independent studies, rather than the industry studies that claim true positive rates over 90%.
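To see why even the ‘good’ 25% figure hurts, consider the alert burden it creates. A small illustrative calculation, in which the fleet size, failure base rate, and the split between false positives and false negatives are my assumptions rather than figures from those studies:

```python
# Illustration of why even a "good" 25% combined error rate erodes trust.
# Fleet size, base rate, and the error split are assumptions for illustration.
machines = 1000   # monitored assets
failing = 50      # assets actually developing a fault (5% base rate)
healthy = machines - failing

false_negative_rate = 0.10  # assumed share of real faults that are missed
false_positive_rate = 0.15  # assumed share of healthy assets flagged anyway
# (together these make up the ~25% "combined" figure cited above)

true_alerts = failing * (1 - false_negative_rate)  # 45 real faults caught
false_alerts = healthy * false_positive_rate       # 142.5 nuisance work orders
missed = failing * false_negative_rate             # 5 faults missed
precision = true_alerts / (true_alerts + false_alerts)

print(f"Real faults caught: {true_alerts:.0f}")
print(f"Nuisance alerts:    {false_alerts:.0f}")
print(f"Faults missed:      {missed:.0f}")
print(f"Share of alerts that are real: {precision:.0%}")  # ~24%
```

When roughly three out of four alerts send a technician to a healthy machine, the program loses credibility long before it loses money.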
Where Reliability Goes from Here
The R&M profession is at a crossroads. Demographic change has created skills and knowledge gaps in manufacturing and maintenance, with analysts projecting that retirements and layoffs will leave over 2.1 million US manufacturing jobs unfilled by 2030.
Surveys show that only 18% of retiring workers feel they have fully shared their knowledge, while 57% share less than half. Preserving institutional expertise through structured mentoring and documentation is therefore urgent.
At the same time, technology is advancing rapidly. Vendors of new sensors, digital twins, and generative AI claim to create synthetic (modeled) sensor logs that fill in missing variables and simulate rare failure modes, promising to enhance predictive maintenance.
Predictive tools reveal patterns, but only skilled people turn patterns into prevention.
Yet technology should not be viewed as a panacea. Generative AI models are designed to produce plausible-sounding outputs rather than true data, and when they hallucinate, they can confidently present false guidance and damage trust. AI should therefore complement, not replace, human expertise.
A ‘back to basics’ approach – emphasizing proper lubrication, condition-based monitoring, and preventive maintenance – ensures that new tools rest on a stable foundation.
Organizations that pair sound maintenance practices with ethical, carefully validated AI deployments and robust knowledge-transfer programs will be best positioned to succeed in the upcoming years. This includes ensuring that internal data, records, and processes are accurate and up to date.