IoT network outage?! C’mon, man!
So you’ve deployed your first (or second) IoT system. Maybe it gives you production performance, OEE or maintenance alerts.
Way cool! You’ve harnessed the power of device-to-cloud information, and you are on the road to Malibu, Ferraris and sunglasses.
Data loss? What do you mean, data loss?!
What’s this text message… Your CEO wants to know why there are no production numbers from Plant 1 yesterday. Um, let’s see what the cloud says… Hmm. No data from yesterday. Let’s call the local IT guy.
“Hi Jason, how goes it?”
“Great, Dave, except that a PG&E contractor hit our fiber line yesterday with a backhoe.”
Crap.
Call to CEO: “Hi Frank, this is Dave. Yeah, the fiber to Plant 1 was cut, so the data didn’t make it to the cloud. What? When will the data be updated?
Actually Frank, it won’t.”
(pause)
“Well, this kind of technology doesn’t have a ‘backup’; that’s why it’s low cost. No, we can’t regenerate the missing data. Um, yes, that is inconvenient… What happens if the connection stays down for 3 more days? Well… then we won’t have performance data for all four days.
Yes, I think the information is important. Yes, I agree… I will see what I can do going forward.”
Sigh.
Low-cost data collection, high-value analytics. Works awesome unless the infrastructure fails. So should we go back to onsite historians and siloed reporting? That’ll just kill the advantage of on-demand, anywhere analytics. So how do you avoid the high cost of IoT Network Outages?
How to avoid the high cost of IoT Network Outages
As you probably figured, I have an answer or two for your perusal. First, we have to categorize the type of information and its value. Is it:
- Business Critical
- Business Helpful
- Nice to know
Business critical: does it impact real-time decision making, and do those decisions impact performance or cost?
Business helpful: can I impact performance or cost by getting the information in a relatively timely fashion?
Nice to know: no impact on performance or cost – basically a historical view.
You’ll have to make the decision on which category your info fits into; I would wager it’s 1 or 2, because why would you spend the time, effort and dollars to deploy an IIoT system for a “nice to know” scenario?
Everything from here on assumes your information is Business Critical or Business Helpful, meaning we can impact performance or cost by using the IoT-provided info. On the other hand, if your system is in the “nice to know” category, read on if you plan to deploy business critical systems in the future.
So, can you afford data loss?
I say no. Especially not when you can prevent data loss with currently available technology. Production performance information, in and of itself, is a valuable commodity. Overall output, output by shift, output by date, production cost, labor cost, rework cost, all of those things can be impacted positively if real-time performance information is made available.
How to avoid IoT Data Loss
Start by looking for IIoT solutions that are available as a service. You want a robust architecture, a service level agreement in the five nines (99.999% uptime), and the ability to scale rapidly. If you are building it yourself, you’re already behind the curve, because you AREN’T a software development outfit. Unless you have the methodologies and the desire to dump 30 man-years of development into it, just stop.
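To put a number on what “five nines” buys you, here is a quick back-of-the-envelope calculation. It is only arithmetic on the uptime percentage. The function name is mine, and your provider’s SLA may measure availability differently (per month, excluding planned maintenance, and so on).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_pct):
    """Return the yearly downtime (in minutes) permitted by an uptime percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


five_nines = allowed_downtime_minutes(99.999)
three_nines = allowed_downtime_minutes(99.9)
print(f"99.999% uptime allows about {five_nines:.2f} minutes of downtime per year")
print(f"99.9% uptime allows about {three_nines:.1f} minutes of downtime per year")
```

Keep in mind that the SLA covers the provider’s service, not the fiber line to your plant, which is why the next point matters.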
Next, make double damn sure your provider supports local Store and Forward on their edge data collector. This is the killer app that enables a zero loss scenario.
Most IIoT and IoT edge devices are designed to be low power, wake up, take a reading and send info. This is a true fire and forget method, and here’s how it works.
- Edge device wakes up (typically every 30 seconds to 60 minutes)
- Edge device polls attached sensors
- Edge device sends values to the cloud (zero confirmation of receipt)
- Edge device goes back to sleep
Interestingly, even a network slowdown can disrupt delivery in this kind of scenario. In addition, this method of data collection only works in non-real-time or slow-moving systems. For food processors, you might be able to use this methodology on a pasteurization process, but you sure can’t use it on a packaging line!
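The wake, poll, send, sleep cycle above can be simulated in a few lines to show where the data goes missing. This is a toy sketch, not any vendor’s firmware: the function name, the simulated sensor values and the `link_states` list (True means the network is up for that cycle) are all made up for illustration.

```python
def fire_and_forget_cycles(link_states, cloud):
    """Simulate wake/poll/send/sleep cycles with no delivery confirmation."""
    for cycle, link_up in enumerate(link_states):
        reading = {"cycle": cycle, "value": 20.0 + cycle}  # simulated sensor poll
        if link_up:
            cloud.append(reading)  # delivered, but the device is never told
        # No acknowledgement and no local copy: a reading sent while the
        # link is down is gone once the device goes back to sleep.


cloud = []
# The network is down during cycles 2 and 3.
fire_and_forget_cycles([True, True, False, False, True], cloud)
received = [r["cycle"] for r in cloud]
print("Cycles that reached the cloud:", received)
```

The device ran all five cycles and has no idea that two of them never arrived.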
The Top 3 Ways to Survive IoT Network Outages
- Make sure your edge device can do event based data capture
- Make sure your edge device does Store and Forward data capture
- Make sure your edge device has health and connection monitoring
Event based data capture.
In batch and discrete (widget) manufacturing, almost everything has a start and a stop: a shift change, a SKU change or a raw material change, for example. Additionally, there are typically counts of units produced and boxed. All of these are event based, NOT time based! So your edge device needs to support event based data collection AND time based data collection.
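Here is a minimal sketch of a collector that does both kinds of capture. It assumes a single event type (a SKU change) and a fixed sampling interval. The `Collector` class, its field names and the record layout are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Collector:
    interval_s: float                       # time based sampling period, in seconds
    last_sample_ts: float = float("-inf")   # forces a sample on the first observation
    last_sku: Optional[str] = None
    records: list = field(default_factory=list)

    def observe(self, ts, sku, units):
        # Event based: record the moment the SKU changes, whatever the clock says.
        # The first SKU seen also counts as a change.
        if sku != self.last_sku:
            self.records.append({"ts": ts, "kind": "event", "sku": sku, "units": units})
            self.last_sku = sku
        # Time based: record a periodic sample once the interval has elapsed.
        if ts - self.last_sample_ts >= self.interval_s:
            self.records.append({"ts": ts, "kind": "sample", "sku": sku, "units": units})
            self.last_sample_ts = ts


collector = Collector(interval_s=60)
observations = [(0, "A", 0), (30, "A", 12), (60, "A", 25), (75, "B", 25), (120, "B", 40)]
for ts, sku, units in observations:
    collector.observe(ts, sku, units)

kinds = [(r["ts"], r["kind"]) for r in collector.records]
print(kinds)
```

The SKU change at 75 seconds is captured when it happens, instead of being smeared into the next 60 second sample.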
Local Store and Forward.
Most edge devices don’t do anything except act as a communications gateway. They grab the sensor data and fire it off to the cloud, with no ability to determine success or failure. For me, this is a non-starter. The data is critical, because it impacts the post-analytics info. If you can’t ensure the data validity, you can’t make decisions on the resulting info.
Store and Forward is the answer to this conundrum. It requires your edge data collector to have a local database, a data synchronization engine and time measurement/sync capabilities. It also requires the device to have a cloud-configurable data capture map. Here’s how it works:
- The edge device captures data as instructed by the data map and stores it in the local database.
- The data sync engine contacts the cloud, sends the data, confirms receipt, and executes error checking.
- The cloud integrates the data into the right time slots and lets the local device know everything is complete.
- The local edge device then deletes local data that is confirmed received and good by the cloud.
This creates an ever rolling buffer that overcomes momentary and extended network and internet issues, and ensures the integrity of your analytics.
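The four steps above can be sketched with nothing more than Python’s built-in `sqlite3` module standing in for the local database. Treat this as an illustration under assumptions, not a product design: the `StoreAndForward` class, the `buffer` table and the `upload` callback (which is assumed to return the ids the cloud confirmed, or raise `ConnectionError` when the link is down) are all hypothetical.

```python
import sqlite3
import time


class StoreAndForward:
    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS buffer ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "ts REAL NOT NULL, tag TEXT NOT NULL, value REAL NOT NULL)"
        )

    def capture(self, tag, value, ts=None):
        """Step 1: store every reading locally, with its timestamp, before sending."""
        stamp = time.time() if ts is None else ts
        self.db.execute("INSERT INTO buffer (ts, tag, value) VALUES (?, ?, ?)", (stamp, tag, value))
        self.db.commit()

    def pending(self):
        """Number of readings still waiting for cloud confirmation."""
        return self.db.execute("SELECT COUNT(*) FROM buffer").fetchone()[0]

    def sync(self, upload, batch_size=100):
        """Steps 2 to 4: send the oldest rows, then delete only what the cloud confirms."""
        rows = self.db.execute(
            "SELECT id, ts, tag, value FROM buffer ORDER BY id LIMIT ?", (batch_size,)
        ).fetchall()
        if not rows:
            return 0
        try:
            confirmed_ids = list(upload(rows))  # hypothetical cloud call
        except ConnectionError:
            return 0  # link is down: keep everything and retry on the next sync
        self.db.executemany("DELETE FROM buffer WHERE id = ?", [(i,) for i in confirmed_ids])
        self.db.commit()
        return len(confirmed_ids)


# A fake cloud for the demo; flip link["up"] to simulate the backhoe.
link = {"up": False}
cloud_rows = []


def fake_upload(rows):
    if not link["up"]:
        raise ConnectionError("link down")
    cloud_rows.extend(rows)
    return [row[0] for row in rows]  # the cloud confirms every row it received


edge = StoreAndForward()
for n in range(3):
    edge.capture("line1.units", float(n), ts=1000.0 + n)

sent_during_outage = edge.sync(fake_upload)
pending_during_outage = edge.pending()
link["up"] = True
sent_after_recovery = edge.sync(fake_upload)
pending_after_recovery = edge.pending()
print(sent_during_outage, pending_during_outage, sent_after_recovery, pending_after_recovery)
```

Because every row carries the timestamp from when it was captured, the cloud can slot late-arriving data into the right time period. With an on-disk database file instead of `:memory:`, the buffer would also survive a power cycle of the edge device.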
Health and connection monitoring.
The edge device and the cloud should share a heartbeat that keeps the cloud analytics informed of the device’s state at all times. The cloud should know within seconds or minutes that the device is not connected. In addition, it should know that the edge data collector has a health issue well before it becomes an operational issue.
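On the cloud side, a heartbeat check can be as simple as remembering when each device last reported and what it said about itself. The sketch below is one way to do it. The `HeartbeatMonitor` class, the 30 second timeout, the health fields and the thresholds are all assumptions for illustration, so tune them to your own devices.

```python
class HeartbeatMonitor:
    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s
        self.last_seen = {}  # device_id -> (timestamp, health dict)

    def beat(self, device_id, ts, disk_free_pct, buffered_rows):
        """Record a heartbeat and the health figures the device reported with it."""
        health = {"disk_free_pct": disk_free_pct, "buffered_rows": buffered_rows}
        self.last_seen[device_id] = (ts, health)

    def status(self, device_id, now):
        if device_id not in self.last_seen:
            return "unknown"
        ts, health = self.last_seen[device_id]
        if now - ts > self.timeout_s:
            return "disconnected"  # no heartbeat within the timeout
        # Assumed thresholds: flag trouble before it becomes an operational issue.
        if health["disk_free_pct"] < 10 or health["buffered_rows"] > 10_000:
            return "degraded"
        return "ok"


monitor = HeartbeatMonitor(timeout_s=30.0)
monitor.beat("plant1-edge", ts=100.0, disk_free_pct=55.0, buffered_rows=0)
healthy = monitor.status("plant1-edge", now=110.0)

monitor.beat("plant1-edge", ts=120.0, disk_free_pct=6.0, buffered_rows=0)
low_disk = monitor.status("plant1-edge", now=125.0)

silent = monitor.status("plant1-edge", now=200.0)  # 80 seconds since the last heartbeat
print(healthy, low_disk, silent)
```

Had Dave been running something like this, the text message about Plant 1 would have come from the system, not from the CEO.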
There are other things that go into a successful IIoT system as well, but the above level of service will recover from both network slowdowns and outages that last for hours or even days.
The Conclusion.
Do yourself a favor and review your current and future system needs in light of how the info will impact your users and business. It would really suck to deploy a system that is “broken” before it’s even in use.
The relative value of the information, and the impact of it going missing, should determine your solution selection.