From Condition Monitoring to Smart Decisions: The New Maintenance Era

by Joel Levitt | Articles, Maintenance and Reliability, Predictive Maintenance


Society is clearly heading somewhere new. Consumer products such as Siri and Alexa are blazing a trail, drawing on big data, AI, the cloud, and just about everything else. Ads keep popping up for the cloud, AI (artificial intelligence, like IBM’s Watson), big data, and machine learning.

Maintenance Management is somewhat late to the party. But once at the party, the field is changing rapidly.

The goal of all the hi-tech gear, software, and prescriptive maintenance is to reach the best business decision for the whole company, in real time.

In traditional maintenance departments, skilled tradespeople, engineers, and leaders make decisions based on their broad experience and limited data. In the future, they will be aided by wireless data collection from a wide variety of sensors, stored in massive data files, subject to myriad algorithms for analysis.

These analytical programs will help maintenance professionals make optimal decisions. Eventually, the software will issue work orders to correct problems before any human realizes there is a problem.

The analysis helps the SMEs (subject-matter experts) decide, in real time, whether and when to intervene and what action to take, based on:

The current condition of the asset and its resistance to failure
Failure probability and risk
Understanding of consequences of failure
The incoming data stream from the asset

What makes this process more powerful is that, given the correct inputs and the proper analysis, maintenance decisions will take into consideration all these conflicting requirements:

Production requirements, options, needs
Production window
Maintenance availability
Parts availability and lead times
Financial conditions, cash flow, profit, customer retention
Maintenance costs, maintenance capabilities

An important question is, how will all this affect you?

As CMMS, wired and wireless sensors, analytics, mobile order entry, and real-time data become available, decision-making will improve. In some cases, the maintenance advantage will be slight, but the operational advantage will be significant.

A few of the more straightforward maintenance improvement ideas include:

You can schedule PMs when they will impact the production cycle the least. That improves demand hours.
The shop schedule will be complete and more reliable, with fewer disruptions.
Catastrophic events will fall dramatically because some of those events announce themselves through the sensors (oil or water problems, for example).
Rebuilds will be better scheduled, and only the worn parts will be replaced.

The list can go on, but the idea is that better information in the hands of the people who can use it supports intelligent decision making.

Maintenance Uses for AI

A couple of specific maintenance uses:

Continuous readout of machine operating parameters will detect faults before they become catastrophes. The software will stay alert for “not to exceed” values and trends to alert the operator. Decisions can be made in real time in the real world, considering the product, time of day, and maintenance availability.
Energy consumption can be monitored and correlated with production, raw materials, and specific products. If the situation is degrading, the machine can be flagged for service (but not on an emergent basis).
With big data and programs that make decisions using rules, we can set up the system to make decent decisions about taking an asset out of service for a comprehensive repair or service at a particular time.
The system can be programmed to detect hot-dogging or abusive operators.
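
The energy-monitoring idea in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name flag_energy_drift, the four-period baseline, the 10% tolerance, and the sample figures are all invented for the example and do not come from any particular product.

```python
from statistics import mean

def flag_energy_drift(kwh, units, baseline_periods=4, tolerance=0.10):
    """Return True if recent energy per unit exceeds the baseline by more than tolerance.

    kwh and units are equal-length lists of per-period totals (hypothetical data).
    """
    if len(kwh) != len(units) or len(kwh) <= baseline_periods:
        raise ValueError("need matching series longer than the baseline window")
    # Specific energy: kWh consumed per unit produced in each period.
    specific = [e / u for e, u in zip(kwh, units)]
    baseline = mean(specific[:baseline_periods])
    recent = mean(specific[baseline_periods:])
    return recent > baseline * (1 + tolerance)

# Made-up weekly totals: output is steady, energy use creeps up.
kwh = [1000, 1010, 990, 1000, 1150, 1180]
units = [500, 505, 495, 500, 500, 500]
needs_service = flag_energy_drift(kwh, units)
```

With these made-up numbers the machine is flagged, because the last two weeks use noticeably more energy per unit than the first four. A flag like this would feed a planned service request, not an emergency call.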

Getting the correct information into the right hands at the right time

Much of this data is perishable. If it is not used promptly, it becomes useless. The ability to inform the maintenance team in real time is worth money in terms of schedule efficiency, increased MTBF (fewer events), and reduced downtime. Imagine that an asset is being worked on, and the system examines the data in detail and determines that, if additional work is done now, the asset will not have to be disturbed for an extended period. It would also be nice to know if we are within 10% of the next scheduled PM.
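
The "within 10% of the next scheduled PM" check is simple arithmetic. Here is a minimal sketch, assuming PMs are scheduled on a meter (hours, miles, or cycles); the function name pm_is_near and the sample meter values are hypothetical.

```python
def pm_is_near(current_meter, last_pm_meter, pm_interval, window=0.10):
    """True if the asset is within `window` (a fraction of the interval) of its next PM.

    An overdue PM (negative remaining) also counts as near.
    """
    next_pm = last_pm_meter + pm_interval
    remaining = next_pm - current_meter
    return remaining <= window * pm_interval

# Made-up example: last PM at 4,000 hours, PM every 500 hours.
near = pm_is_near(current_meter=4460, last_pm_meter=4000, pm_interval=500)
not_near = pm_is_near(current_meter=4300, last_pm_meter=4000, pm_interval=500)
```

In the first call only 40 hours remain, which is inside the 50-hour window, so the PM could be pulled forward while the asset is already down. In the second call 200 hours remain, so it could not.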

Let’s get something straight.

All this high-tech and prescriptive maintenance is only one (important) part of a bigger conversation! If the bigger conversation is reliability, then maintenance is part of the solution, but only a part of it.

No matter what anyone says, “Prescriptive and hi-tech maintenance is not sufficient for world-class results!” At best, hi-tech solutions for maintenance are a part of maintenance, and maintenance is a part of reliability.

Prescriptive maintenance and hi-tech sensors are tools in the toolbox

We strive to be the best maintenance department we can be. We call that striving for world-class maintenance.

World-class maintenance is an amorphous concept, and it is not real, but it can have real benefits. Its primary benefit is that it can create an opening for action and can cause people and their organizations to stretch themselves and develop.

But if someone claims they can get you there – guaranteed, they will take you for a ride!

Block Diagram for High Tech Maintenance

Data sources

There are hundreds of possible data sources. Each of these sources is formatted (vibration readings) or unformatted (PM tech notes on machine health). Data collection can be manual (pictures), semi-automated (vibration route where you plug in the data collector for upload), or automated (wireless sensor) from inside systems (CMMS) or outside systems (weather reports).

Data (usually called records with individual fields) is collected every second (wireless sensors), daily (rounds, production), weekly, monthly, or less often (PM tickets), as generated (work orders), and at other frequencies.

One primary data source is asset condition monitoring. The movement in this arena is from manual human inspection to instrument-based inspection to wireless sensor-based inspection.

Using the condition of the equipment to initiate a maintenance action has been called condition-based maintenance. Condition-based maintenance (CBM) looked for a reading, a differential reading, or a trend to trigger actions. Condition-based maintenance will likely be used differently in the future.
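
The three classic CBM triggers can be sketched as follows. This is an illustration, not a standard implementation: the function name cbm_triggers and all limits are invented, and "differential" is interpreted here as the change between consecutive readings (it could equally mean the difference between two sensors, such as the pressure drop across a filter).

```python
def cbm_triggers(readings, limit, max_step, max_rise, window=5):
    """Return the list of condition-based triggers that fire on the latest reading.

    readings: chronological list of values from one sensor (hypothetical data).
    limit: absolute "not to exceed" value.
    max_step: largest acceptable change between consecutive readings.
    max_rise: largest acceptable rise across the last `window` readings.
    """
    fired = []
    latest = readings[-1]
    if latest > limit:
        fired.append("reading")
    if len(readings) >= 2 and abs(latest - readings[-2]) > max_step:
        fired.append("differential")
    if len(readings) >= window and latest - readings[-window] > max_rise:
        fired.append("trend")
    return fired

# Made-up bearing temperatures in degrees C, one per inspection.
temps = [60, 61, 63, 66, 70, 75]
result = cbm_triggers(temps, limit=85, max_step=10, max_rise=12)
```

In this example no single reading is over the limit and no single step is large, yet the trend trigger fires because the temperature has climbed 14 degrees over the last five readings.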

Sensors mounted on the equipment collect data cheaply (once you pay for the installation) and accurately. The other great advantage is that once installed, the readings can be gathered without intrusive activities. This non-intrusive approach reduces the chance of a maintenance-induced fault. Some of the hundreds of sensors include:

RFID
Motion
Cameras
Hi-speed capture
Temperature
- Duct temp
- Water temp
- RTD high, low
Vibration (accelerometers)
Sound level
Mechanical stress
Dry contact
Open/closed
Water present
Voltage
- 0-5 V
- Voltmeter
- Volt detection
- Conductivity
Pressure
Spectrometer
Resistance
Amps
- Amp meter
- Current detection
Tilt
Impact
Movement
G force (snapshot)
Weather
Colorimeter
Turbidity
CO2
pH
Light

The list above is a partial list of available detection methods. Once data is collected, we can begin the analysis and focus on various aspects of managing the asset to maximize value to our organization. Our focus can be on low-cost, long-term unit output (cost per part/ton/mile), the longest asset life, operating costs, product quality, safety, or lower acquisition costs.

Focus on what is essential.

The above condition-monitoring tools can detect a wide range of conditions, but only a few states will affect the asset’s usefulness. We choose only sensors that can detect the specific failure modes of the asset or process. Understanding the failure mode is key to using sensors effectively.

In this section, we are focusing on breakdowns, downtime, and lower maintenance costs. To look at breakdowns and avoid disruptions, we start with definitions of failure and failure mode.

Failure: The loss of a function under stated conditions.

Failure mode: The specific manner or way by which a failure occurs in terms of the failure of the item; it may generally describe the way the failure occurs. It shall at least clearly describe an (end) failure state of the item. It is the result of the failure mechanism (cause of the failure mode).

There is a good deal of knowledge and some creativity required to identify sensors suitable to detect some of the hidden or more obscure failure modes.

Storage

One technological breakthrough has been the ongoing reduction in storage costs. We can now store more data for longer periods, including big data, at a price that makes it worthwhile.

All the data generated is stored in one or more databases. When these databases are outside your plant, they are called cloud storage. If the databases are local, they are fog storage (a cloud that came down to the ground). The big players in cloud storage include Amazon, Microsoft, and IBM. The advantage of a cloud vendor is that you can scale up to almost any degree without building or buying computer power (you just write a bigger check).

Data are values in some form that represent transactions, readings, and observations. Sophisticated databases can store images, handwriting, sound files, and, of course, anything alphanumeric.

For example, a fuel consumption sensor spits out gallons per hour, or gallons per mile, and that is data. GPS locations at different times of the day are data. The same applies to speed, engine temperature, and even work orders and customer load information. It is all data.

Big data is stored in server farms. One example is the Facebook server farm in Luleå, Sweden, less than 70 miles south of the Arctic Circle, which uses cold outside air to cool the servers.

Different data sources

We are now collecting data from various sources. By calling what we are doing big data, we risk obscuring the fact that this data has always been there; it was just hard to access and correlate. With data from many sources available at once, we can combine them in new ways to get a more in-depth look at what is going on.

Big Data

Big data is all the data from all sources that you store in the cloud. While big data has been around for a long time, today’s big data is unique because it relates data of three types. These relationships provide context.

Big data is not new

First, a fact of life: all the hoopla is about the same information we’ve always had. The difference is what they call the 4 V’s of big data:

Volume: The best driver might scan their oil temperature once every couple of minutes. IIoT scans it every second.
Velocity: The data sources are working much quicker than ever before.
Variety: All the data has always been available. The difference is that it is all together in one place. Now we can correlate weather, cranking systems, or altitude changes (hilly country) with brake repairs.
Veracity: Dealing with the uncertainty of the data by cross-checking it with other data and using advanced statistics to ferret out the truth.

Example of big data

A CAT mining haul truck has 145 sensors. Each sensor reports its readings once a second to the brain on the truck (this is called edge computing, in which a local processor pre-processes the data stream). The processor on the truck aggregates the data and sends it up to the cloud. The data files start to get quite large. They are called “big data” when you add in data streams from other sources. We could add in weather, incoming orders, traffic conditions, fuel prices along the route, and almost anything else that could impact our decisions.

Here are a few of the 145 sensors that are providing readings every second.

Engine Coolant Pump Outlet Pressure
Engine Coolant Temperature
Engine Coolant Pump Outlet Temperature
Engine Oil Level
Engine Oil Pressure
Engine Oil Temperature
Ground Speed
Fuel Consumption Rate
Fuel Filter Differential Pressure
Fuel Pressure
Fuel Rail Pressure
Fuel Rail Temperature
Fuel Temperature
Fuel/ Water Separator Level Status
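
The edge-computing step described above, in which the truck’s processor condenses per-second readings before sending them to the cloud, might look roughly like this. It is a sketch under assumptions: the function name summarize, the 60-second bucket, and the sample readings are made up, and real on-board systems will differ.

```python
from statistics import mean

def summarize(readings, bucket_seconds=60):
    """Condense per-second readings into one summary record per time bucket.

    readings: list of (second, value) tuples for one sensor.
    Returns a list of dicts with min, max, mean, and count for each bucket.
    """
    buckets = {}
    for second, value in readings:
        buckets.setdefault(second // bucket_seconds, []).append(value)
    return [
        {"bucket": b, "min": min(v), "max": max(v), "mean": mean(v), "count": len(v)}
        for b, v in sorted(buckets.items())
    ]

# Two minutes of made-up Engine Oil Temperature readings, one per second.
oil_temp = [(s, 90 + (s % 10)) for s in range(120)]
summary = summarize(oil_temp)
```

Here 120 raw readings shrink to two summary records. Sending summaries instead of every reading is one way to keep the upload manageable, at the cost of losing detail inside each minute.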

Big data, cloud, or sensors are not new or particularly useful on their own.

All the work we’ve been doing is meant to lead us to sound conclusions from this data and to valuable ideas for running our machinery, buildings, or fleets. What we need are outputs that help us make better decisions. Such outputs have been around since the dawn of computers (the first maintenance system was MIDEC, introduced by Mobil Oil in 1965 to tell you when to change the oil in your mobile equipment).

Analytics

How on earth do we make sense of all this data? How do we use it? Making sense of data is called Analytics. Analytics uses algorithms.

Algorithms

All these outputs use algorithms. Algorithms can perform calculations, data processing, and automated reasoning tasks. As UCLA’s John Villasenor points out, this means that even something as innocuous as a recipe or a list of directions to a friend’s house is an algorithm. Algorithms are thought processes, or sequences of steps, for solving problems.

The idea of algorithms is quite old. For example, the Euclidean algorithm, an efficient method for computing the greatest common divisor of two numbers, appears in Euclid’s Elements, a mathematical treatise of 13 books written c. 300 BC.
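
To show how short an algorithm can be, here is the Euclidean algorithm in its usual modern form (repeated remainders). The function name gcd and the sample numbers are chosen for the example.

```python
def gcd(a, b):
    """Greatest common divisor by the Euclidean algorithm."""
    while b:
        # Replace the pair (a, b) with (b, remainder of a divided by b).
        a, b = b, a % b
    return a

answer = gcd(1071, 462)  # remainders run 147, then 21, then 0
```

The loop stops when the remainder reaches zero, and the last non-zero value, 21 in this case, is the greatest common divisor.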

There are thousands of types of algorithms that could be used to analyze maintenance data. Some of the most popular ones are:

Decision trees (a series of decisions or gates)
Bayesian methods (estimates that become increasingly accurate as evidence accumulates)
Linear regression
Ordinary Least Squares Regression
Clustering

Experts in haul trucks

Subject-matter experts (SMEs) develop algorithms by working closely with data scientists (who translate the steps into a form the computer can understand). Anyone with some training could serve in either role. Assuming you are familiar with the equipment and have looked at the numbers from each sensor and their relationships with one another, what could you see? Could you know a lot about how the next few days or weeks will go?

In fact, we could write simple algorithms to tell us:

When to change the fuel filter (Fuel Filter Differential Pressure)
When to change the oil (Engine Oil Level, Engine Oil Pressure, Engine Oil Temperature, mileage, viscosity)
How efficiently the truck is running (Ground Speed, Fuel Consumption Rate)
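
A minimal sketch of such simple rules follows. Everything specific in it is an assumption for illustration: the function name simple_rules, the field names, the units, and every threshold are invented, and real limits would come from the manufacturer and the SME.

```python
def simple_rules(sample, limits):
    """Return the maintenance actions suggested by one snapshot of sensor readings."""
    actions = []
    # A clogged filter shows up as a high pressure drop across it.
    if sample["fuel_filter_dp_kpa"] > limits["fuel_filter_dp_kpa"]:
        actions.append("change fuel filter")
    # Low oil pressure or distance since the last change both prompt an oil change.
    if (sample["oil_pressure_kpa"] < limits["oil_pressure_min_kpa"]
            or sample["km_since_oil_change"] >= limits["oil_change_km"]):
        actions.append("change oil")
    # Fuel rate divided by ground speed gives liters per kilometer.
    if sample["ground_speed_kph"] > 0:
        l_per_km = sample["fuel_rate_lph"] / sample["ground_speed_kph"]
        if l_per_km > limits["fuel_l_per_km"]:
            actions.append("investigate fuel efficiency")
    return actions

# Hypothetical limits and one made-up snapshot.
limits = {"fuel_filter_dp_kpa": 80, "oil_pressure_min_kpa": 250,
          "oil_change_km": 15000, "fuel_l_per_km": 5.0}
sample = {"fuel_filter_dp_kpa": 95, "oil_pressure_kpa": 300,
          "km_since_oil_change": 9000, "fuel_rate_lph": 120, "ground_speed_kph": 30}
actions = simple_rules(sample, limits)
```

For this snapshot only the fuel filter rule fires. Each rule is something an SME could state in one sentence, which is what makes this kind of algorithm a natural starting point.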

Or even more complex algorithms might give us insight into:

Operating parameters that optimize fuel use
Conditions that predetermine expensive failures, like premature engine or transmission failures
Specific conditions right before a major fault

Now imagine what algorithms an engine or haul-truck SME could write to detect defects well before failure, or to identify the conditions under which a small adjustment would make a world of difference to efficiency, reliability, and ultimately tonnage.

Four types of analytics

For our purposes, analytics comes in four flavors:

Descriptive: What is happening? Examples: number of work orders, percentage of incomplete PMs, age of backlog by priority. Descriptive analytics is the kind most widely used by CMMS users.
Diagnostic: Why is this happening? Examples: overtime is up because breakdowns are up; tire problems went up when we changed vendors. A subtype called assisted diagnostics has the computer program help the human think through the possibilities.
Predictive: What is likely to happen? We always wanted this! Examples: the bearing will last until Thursday; in 2 hours, a fire will start in E-123; we will be short 2 electricians next year. Predictive analytics is a fascinating area that was PdM’s unfulfilled capability.
Prescriptive: What is the best course of action? Examples: change the darn bearing! Increase the stock level by four units to achieve a 97% service level.

Look for the consequences of the risk, and let people participate in choosing the level of risk.

When developing detection algorithms or tests, a balance must be struck between risks of false negatives and false positives.
Usually, there is a threshold of how close a match to a given sample must be before the algorithm reports a match.
The higher this threshold, the more false negatives, and the fewer false positives.
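
The trade-off described above can be seen by counting errors at two different thresholds. This is a toy illustration: the function name confusion_at, the scores, and the fault labels are made up.

```python
def confusion_at(threshold, scores, is_faulty):
    """Count false positives and false negatives when alarming at score >= threshold."""
    fp = sum(1 for s, f in zip(scores, is_faulty) if s >= threshold and not f)
    fn = sum(1 for s, f in zip(scores, is_faulty) if s < threshold and f)
    return fp, fn

# Made-up match scores for six assets, and whether each was truly faulty.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
is_faulty = [False, False, True, False, True, True]

low = confusion_at(0.35, scores, is_faulty)   # (false positives, false negatives)
high = confusion_at(0.8, scores, is_faulty)
```

At the low threshold there is one false alarm and no missed fault. At the high threshold there are no false alarms but two missed faults. Which error is worse depends on the consequences of failure, which is why the people who carry the risk should help choose the threshold.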

Believe it or not, most firms already have analytics and don’t use them very well.

Basic analytics has been part of CMMS canned report options since the beginning. The reports are typically not well used or even understood. Maintenance departments that want to be more data-driven could do far more simply by using them. No more investment needed!

Prescription: All the data generated by the machine is collected and sent to the cloud for storage. The cloud-based software performs the analytics and, ideally, gives you a conclusion on what to do and how soon to do it.

It is a cognitive process (whether you use machines or humans)
It involves symptom analysis, data analysis, and health diagnosis
It is based on many data sources
It considers all alternatives for treatment
It always ends in a recommended action, a prescription for action

AI

All this analysis is leading to artificial intelligence (AI).

Potential AI

AI is a series of Algorithms (sometimes quite a few) that review the data and find combinations that represent actionable situations. The data is used to develop maintenance actions. The actions would be like those an experienced maintenance person would come up with if they had the time, attention, and memory to look at the data.

Artificial – made or produced by human beings rather than occurring naturally, typically as a copy of something natural.
Intelligence – the ability to acquire and apply knowledge and skills
Machine learning, deep learning, intelligent agents, bots, neural networks, robots, and autonomous driving are all forms of AI or use AI to operate.

We interact with AI every day with Alexa and Siri. Even some communications with organizations are handled by intelligent agents. An increasing number of systems are passing the Turing test for computer intelligence.

AI is currently used to help fly aircraft, cut logs to optimize lumber yield, help doctors diagnose diseases, and even set up production soup cooking kettles. New applications are being added daily. In maintenance alone, there are hundreds of companies with solutions for different situations.

A brief history of AI

1950: Alan Turing published a paper on the possibility that machines could think. The Turing test says that if you chat with a computer and think it is a person, then the computer is thinking!
1955: Newell and Simon began work on the Logic Theorist, a program that went on to prove 38 of the first 52 theorems in Whitehead and Russell’s Principia Mathematica.
1956: The first conference, held at Dartmouth and called the Summer Research Project on Artificial Intelligence, launched the field and was attended by the who’s who of the field.
1964: ELIZA, running a script known as Doctor, simulated a Rogerian psychotherapist. Many early users were convinced of ELIZA’s intelligence and understanding, despite its creator Joseph Weizenbaum’s insistence to the contrary.
1965: Edward Feigenbaum and his students built the earliest examples of expert systems. Dendral identified compounds from spectrometer readings.
1972: MYCIN, an expert system, diagnosed infectious blood diseases.
1989: Deep Thought (forerunner of Deep Blue) started to defeat chess masters.
1997: Deep Blue became the first computer chess-playing system to beat a reigning world chess champion, Garry Kasparov.
2005: A Stanford robot won the DARPA Grand Challenge by driving autonomously for 131 miles along an unrehearsed desert trail.
2011: On the quiz show Jeopardy!, IBM’s Watson defeated the two greatest Jeopardy! champions, Brad Rutter and Ken Jennings.
2011: Apple’s Siri
2014: Amazon’s Alexa

Your biggest challenge is where to start!

Start with the business problem. What is the biggest and best solution that can provide value to the organization?

Things like

Detecting impending failure
Unsafe acts
Early warning of a change in operation
Anomaly detection
Abuse

AI is becoming an increasingly important part of innovative maintenance management. We must keep in mind one of W. Edwards Deming’s warnings: buying a new gadget or software will not fundamentally change the business process. A bad process or a reactive culture will not be fixed with technology. It is the culture!

ALERT

You may already suffer from the single biggest problem. The problem is not using, designing, or implementing the tech, algorithms, cloud, fog, or sensors. The problem is discipline.

Answer the question: if a PM inspector or a PdM inspection uncovers a defect, deterioration, or damage, do you always plan, schedule, and execute the correction before the asset fails? Simple question.

If not, you will probably ignore the prescriptive maintenance work order (just as you ignore the PdM findings now), or you will defer the action until the unit fails all by itself. In that case, you have the worst of all worlds: fancy tools in a reactive environment.

This is the least safe and most expensive way to run a maintenance department. The prescription for that condition is to change your culture. Work to become a reliability leader in a learning organization of reliability leaders.

Author

Joel Levitt

Joel Levitt is a renowned trainer in the maintenance industry, having trained over 20,000 professionals from 3,000 organizations across 42+ countries. Since 1980, he has led Springfield Resources, a management consulting firm specializing in maintenance solutions. With 35 years of experience in various maintenance roles, including process control, field service, and maritime operations, Levitt is a frequent speaker at industry conferences and the author of 10 books and numerous articles on maintenance management. He has also served on several boards and committees and is an active member of AFE.