You’re going to need a bigger boat!


With the recent 42nd anniversary of ‘Jaws’, everyone quotes the tagline above, ‘you’re going to need a bigger boat’, and that led me to consider how far we have come: from engineering calculations (to solve problems), to data and data science, and now to Big Data, the Internet of Things (IoT) and its Industrial counterpart, and the evolving roles of the engineer and the data scientist.

Back in my day, not quite 42 years ago (but close), we did calculations by hand, on paper, with a calculator, often requiring engineering ‘iterations’ to get the right design flow (or whatever). Yes, we made mistakes (probably too many to count!). Then came spreadsheets, to speed up iterations with fewer mistakes; then static process simulations to speed up design and test alternatives; then dynamic process simulations that let us ‘predict’ the performance of the plant and so optimize the design. From there we integrated with control systems to develop control room operator training, and then moved on to 3D virtual reality headsets that allow the user to ‘walk’ around the plant, integrating control room and field operations.

And so to predictive analytics, where engineering meets data science. Predictive analytics has been around for quite some time, and the ‘engineering’ approach has been to answer the question everyone asks after a process or equipment failure, an outage, or, worse still, an accident: ‘Why couldn’t we see this problem coming?’ Usually you can, but you have to be looking at all the variables, all the time, and apply your engineering knowledge and skills to spot the problem well in advance.

Clearly, you can assign lots of engineers and operators to look at all the data all of the time, or you can use computer software to help. Early solutions looked at the ‘time series’ data only, since deviations from the ‘normal’ are relatively easy to spot, and ‘trending’ solutions can alert and notify the user. But, as with my earlier engineering examples, users then want more, from spreadsheets to 3D VR. Once you have predictions based on time-series analytics, you want to add context: CMMS and maintenance data, operation logs, whether the weather plays a part, and so on. With each added level of complexity, the problem to be solved gets much harder, and your time-series algorithm, even if it’s a really good one, just doesn’t stack up.
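To make the time-series idea concrete, here is a minimal sketch of the kind of deviation detection described above: flag points that drift too far from a rolling ‘normal’. The window size and threshold are assumptions for illustration, not any particular product’s method.

```python
from statistics import mean, stdev

def flag_deviations(series, window=5, threshold=3.0):
    """Flag indices whose value deviates from the trailing window's
    mean by more than `threshold` standard deviations."""
    flags = []
    for i in range(window, len(series)):
        recent = series[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            flags.append(i)
    return flags

# Steady sensor readings with one sudden spike at index 8.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 9.8, 10.0, 25.0, 10.1]
print(flag_deviations(readings))  # [8]
```

This works well enough for a single clean signal; the point of the paragraph above is that it stops working once you fold in maintenance records, operation logs, and weather.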

So you need to look at the problem from different angles and find the best way to solve the ‘really big’ problem, which is: ‘How do I take all my relevant data, format it so it makes sense, contextualize it so I can begin to reason about it, and then use all that data to predict what’s going to happen?’ That’s a big problem that will need a big computer (‘gonna need a bigger computer’) and complex algorithms to solve it.

Cloud computing has been around a while, as has ETL (Extract, Transform, Load), and that is what impressed me enough to join Predikto: the ability to take all the data (as much as anyone would need or want), ETL it, put it in the cloud, and let our ‘MAX’ predictive engine chew through it (just once, over 3-4 weeks), then apply every algorithm a data scientist might know (and a few of our own). Daily, MAX optimizes those algorithms for accuracy and applicability, and tunes the features that drive the predictions as the conditions affecting the process or asset (weather included) vary daily, hourly, and so on.

Now that is really, REALLY clever stuff, and exactly what our customers have been asking for over my 25-plus years in the business.

Big Data, the IIoT, Industrie 4.0, and everything that brings them together, combined with what we are doing at Predikto: that is the future. I am honored to be part of Mario and Robert’s team. Watch this space and stay on this track (sic); the future is in Predikto, and it is HERE!

By Paul Seccombe.

Paul joined Predikto in 2017 after his role as Solutions Leader for GE Predix Oil & Gas in Europe and the Middle East. He also spent many years at SmartSignal. Paul holds a PhD in Biochemical Engineering from the University of Wales. He is based in London.

The Missing Link in Why You’re Not Getting Value From Your Data Science


by Robert Morris, Ph.D.

DECEMBER 28, 2016

Recently, Kalyan Veeramachaneni of MIT published an insightful piece in the Harvard Business Review entitled “Why You’re Not Getting Value from Your Data Science.” The author argued that businesses struggle to see value from machine learning/data science solutions because most machine learning experts tend not to build and design models around business value. Rather, machine learning models are built around nuanced tuning and subtle, yet complex, performance enhancements. Further, experts tend to make broad assumptions about the data that will be used in such models (e.g., consistent and clean data sources). With these arguments, I couldn’t agree more.

WHY IS THERE A MISSING LINK?

At Predikto, I have overseen many deployments of our automated predictive analytics software across Industrial IoT (IIoT) verticals, including the transportation industry. In many cases, our initial presence at a customer stems in part from the limited short-term value gained from an internal (or consulting) human-driven data science effort, where the focus had been exactly what Kalyan described: the “model” rather than how to actually get business value from its results. Many companies aren’t seeing a return on their investment in human-driven data science.

There are many reasons why experts don’t cook business objectives into their analytics from the outset. This is largely due to a disjunction among academic expertise, habit, and operations management (not to mention the immense diversity of focus areas within the machine learning world, which is a separate topic altogether). It is particularly relevant for large industrial businesses striving to cut costs by preventing unplanned operational downtime. Unfortunately, one of the most difficult aspects of deploying machine learning solutions geared toward business value is actually delivering and demonstrating that value to customers.

WHAT IS THE MISSING LINK?

In the world of machine learning, over 80% of the work revolves around cleaning and preparing data for analysis, which comes before the sexy machine learning part (see this recent Forbes article for survey results supporting the claim). The remaining 20% involves tuning and validating results from machine learning models. Unfortunately, this calculation fails to account for the most important element of the process: extracting value from the model output.

In business, the goal is to gain value from predictive model accuracy (another subjective topic area worthy of its own dialog). We have found that this is the most difficult aspect of deploying predictive analytics for industrial equipment. In my experience, the breakdown of effort required from beginning (data prep) to end (demonstrating business value) is really more like:

40% Cleaning/Preparing the Data

10% Creating/Validating a well performing machine learning model/s

50% Demonstrating Business Value by operationalizing the output of the model

The latter 50% is rarely discussed in machine learning conversations (with the aforementioned exception). Veeramachaneni is right: keep models simple if you can, cast a wide net to explore more problems, don’t assume you need all of the data, and automate as much as you can. Predikto is doing all of these things. But again, this is only half the battle. Once you have tackled each of the above, you still have to:

Provide an outlet for near-real-time performance auditing. In our market (heavy industry), customers want proof that the models work with their historical data, with their “not so perfect” data today, and with their data in the future. The right solution provides fully transparent and consistent access to detailed auditing data from top to bottom: what data are used, how models are developed, and how the output is being used. This is not only about trust; it is about a continuous improvement process.

Provide an interface for users to tune output to fit operational needs and appetites. Tuning the output (not the model) is everything. Users want to set their own thresholds for each output and have the option to return to a previous setting on the fly, should operating conditions change. One person’s red alert is not the same as another’s, and all of this may be different tomorrow.

Provide a means for taking action from the model output (i.e., the predictions). Users of our predictive output are fleet managers and maintenance technicians. Even with highly precise, high-coverage machine learning models, the first thing they all ask is, “What do I do with this information?” They need an easy-to-use, configurable interface that takes a prediction notification, originating from a predicted probability, to a business action in a single click. For us, that is often the creation of an inspection work order in an effort to prevent a predicted equipment failure.
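The tuning point in the second item above can be sketched in a few lines: thresholds are applied to the model’s output probabilities, not to the model itself, and each user keeps an editable, revertible copy. The level names and numbers are assumptions for illustration only.

```python
# Hypothetical sketch: user-tunable alert thresholds applied to
# predicted failure probabilities (the model itself is untouched).

DEFAULT_LEVELS = {"red": 0.85, "yellow": 0.60}

def alert_level(prob, levels=DEFAULT_LEVELS):
    """Map a predicted failure probability to an alert level."""
    if prob >= levels["red"]:
        return "red"
    if prob >= levels["yellow"]:
        return "yellow"
    return "none"

# Each user keeps an editable copy, so thresholds can be tuned per
# fleet and rolled back to the defaults if conditions change.
user_levels = dict(DEFAULT_LEVELS)
user_levels["red"] = 0.75  # this user wants earlier red alerts

print(alert_level(0.80))               # default thresholds: "yellow"
print(alert_level(0.80, user_levels))  # tuned thresholds:   "red"
```

The same prediction triggers different alerts for different users, which is exactly why one person’s red alert is not another’s.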

Predikto has learned by doing and iterating. We understand how to get value from machine learning output, and it has been a big challenge. This understanding led us to create the Predikto Enterprise Platform®, Predikto MAX® [patent pending], and the Predikto Maintain® user interface. We scale across many potential use cases automatically (regardless of the type of equipment), we test countless model specifications on the fly, we give the customer control in interfacing with the predictive output, and we provide an outlet for them to take action from their predictions and show value.

As to the missing 50% discussed above, we tackle it directly with Predikto Maintain® and we believe this is why our customers are seeing value from our software.


Robert Morris, Ph.D. is Co-founder and Chief Science/Technology Officer at Predikto, Inc. (and former Associate Professor at University of Texas at Dallas).

Does your company have a Big Data and analytics strategy?

McKinsey Quarterly published an article in March 2013 titled “Big data: What’s your plan?”. In summary:

The payoff from deploying a big data and advanced analytics solution can be productivity and profit gains 5 to 6 percent higher than those of competitors. To achieve this, companies need to develop a plan and strategy. A successful plan requires dialogue at the top of the company to establish investment priorities; to balance speed, cost, and acceptance; and to create the conditions for frontline engagement.

A successful plan will focus on:
1) Data: How to assemble and integrate data is essential, since data is spread and siloed across the organization, sometimes in different systems (or organizations). Making the information useful and available is a required capability.

2) Analytic Models: Integrating data does not accomplish much if you cannot use it to make decisions. Advanced analytic models enable data-driven optimization and prediction. The plan needs to identify where analytical models will make the biggest impact on the organization.

3) Tools: The output from advanced analytical models can be complex, so it is only useful if employees and managers at the right level of the organization can take action on the information.

Big data and analytics plans, and the value they deliver, will vary by company and industry. The data will come from many sources: customers, vendors, sales, maintenance, and physical objects (like meters).
