The Missing Link in Why You’re Not Getting Value From Your Data Science

by Robert Morris, Ph.D.

DECEMBER 28, 2016

Recently, Kalyan Veeramachaneni of MIT published an insightful article in the Harvard Business Review entitled “Why You’re Not Getting Value from Your Data Science.” The author argued that businesses struggle to see value from machine learning/data science solutions because most machine learning experts tend not to build and design models around business value. Rather, machine learning models are built around nuanced tuning and subtle, yet complex, performance enhancements. Further, experts tend to make broad assumptions about the data that will be used in such models (e.g., consistent and clean data sources). With these arguments, I couldn’t agree more.

 

WHY IS THERE A MISSING LINK?

At Predikto, I have overseen many deployments of our automated predictive analytics software across Industrial IoT (IIoT) verticals, including the transportation industry. In many cases, our initial presence at a customer is due in part to the limited short-term value gained from an internal (or consulting) human-driven data science effort that focused on just what Kalyan mentioned: the “model” rather than how to actually get business value from the results. Many companies aren’t seeing a return on their investment in human-driven data science.

There are many reasons why experts don’t cook business objectives into their analytics from the outset. This is largely due to a disjunction between academic expertise, habit, and operations management (not to mention the immense diversity of focus areas within the machine learning world, which is a separate topic altogether). This is particularly relevant for large industrial businesses striving to cut costs by preventing unplanned operational downtime. Unfortunately, the most difficult part of deploying machine learning solutions geared toward business value, and the part that takes the bulk of the effort, is actually delivering and demonstrating value to customers.

WHAT IS THE MISSING LINK?

In the world of machine learning, over 80% of the work revolves around cleaning and preparing data for analysis, which comes before the sexy machine learning part (see this recent Forbes article for some survey results supporting this claim). The remaining 20% involves tuning and validating results from the machine learning model(s). Unfortunately, this calculation fails to account for the most important element of the process: extracting value from the model output.

In business, the goal is to gain value from predictive model accuracy (another subjective topic area worthy of its own dialog). We have found that this is the most difficult aspect of deploying predictive analytics for industrial equipment. In my experience, the breakdown of effort required from beginning (data prep) to end (demonstrating business value) is really more like:

40% Cleaning/Preparing the Data

10% Creating/Validating well-performing machine learning model(s)

50% Demonstrating Business Value by operationalizing the output of the model

The latter 50% is something that is rarely discussed in machine learning conversations (with the aforementioned exception). Veeramachaneni is right. It makes a lot of sense to keep models simple if you can, cast a wide net to explore more problems, avoid assuming you need all of the data, and automate as much as you can. Predikto is doing all of these things. But again, this is only half the battle. Once you have each of the above elements tackled, you still have to:

Provide an outlet for near-real-time performance auditing. In our market (heavy industry), customers want proof that the models work with their historical data, with their “not so perfect” data today, and with their data in the future. The right solution provides fully transparent and consistent access to detailed auditing data from top to bottom: from what data are used, to how models are developed, to how the output is being used. This is not only about trust; it is also about continuous improvement.

Provide an interface for users to tune output to fit operational needs and appetites. Tuning output (not the model) is everything. Users want to set their own threshold for each output and have the option to return to a previous setting on the fly, should operating conditions change. One person’s red alert is not the same as another’s, and this all may be different tomorrow.

Provide a means for taking action from the model output (i.e., the predictions). Users of our predictive output are fleet managers and maintenance technicians. Even with highly precise, high-coverage machine learning models, the first thing they all ask is, “What do I do with this information?” They need an easy-to-use, configurable interface that allows them to take a prediction notification, originating from a predicted probability, to business action in a single click. For us, it is often the creation of an inspection work order in an effort to prevent a predicted equipment failure.
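The last two points, tuning output and taking action, can be illustrated with a minimal Python sketch. Everything in it is hypothetical and is not part of any Predikto product: the names `OutputThreshold` and `to_work_order`, the asset ID, and the probabilities are made up for illustration. The only point is that the threshold belongs to the user, sits on top of the model output rather than inside the model, and keeps a history so the previous setting can be restored quickly.

```python
from dataclasses import dataclass, field


@dataclass
class OutputThreshold:
    """Hypothetical user-tunable alert threshold applied to model output."""
    value: float
    history: list = field(default_factory=list)  # earlier settings, oldest first

    def update(self, new_value: float) -> None:
        """Change the threshold and remember the previous setting."""
        if not 0.0 <= new_value <= 1.0:
            raise ValueError("threshold must be a probability in [0, 1]")
        self.history.append(self.value)
        self.value = new_value

    def revert(self) -> None:
        """Return to the previous setting, if there is one."""
        if self.history:
            self.value = self.history.pop()


def to_work_order(asset_id: str, probability: float, threshold: OutputThreshold):
    """Turn a predicted failure probability into an inspection work order.

    Returns None when the probability is below the user's threshold.
    """
    if probability < threshold.value:
        return None
    return {
        "asset_id": asset_id,
        "action": "inspect",
        "predicted_probability": probability,
    }


threshold = OutputThreshold(value=0.8)
print(to_work_order("asset-17", 0.72, threshold))  # below the threshold: no work order
threshold.update(0.7)                              # the user lowers the bar
print(to_work_order("asset-17", 0.72, threshold))  # the same prediction now yields a work order
threshold.revert()                                 # conditions change: back to 0.8
```

The model and its predicted probability are untouched throughout; only the rule that converts a probability into an action changes.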

Predikto has learned by doing, and iterating. We understand how to get value from machine learning output, and it’s been a big challenge. This understanding led us to create the Predikto Enterprise Platform®, Predikto MAX® [patent pending], and the Predikto Maintain® user interface. We scale across many potential use cases automatically (regardless of the type of equipment), we test countless model specifications on the fly, we give some control to the customer in terms of interfacing with the predictive output, and we provide an outlet for them to take action from their predictions and show value.

As to the missing 50% discussed above, we tackle it directly with Predikto Maintain® and we believe this is why our customers are seeing value from our software.

Robert Morris, Ph.D. is Co-founder and Chief Science/Technology Officer at Predikto, Inc. (and former Associate Professor at University of Texas at Dallas).

Deploying Predictive Analytics (PdA) as an Operational Improvement Solution: A few things to consider

“…in data science…many decisions must be made and there’s a lot of room to be wrong…”

There are a good number of software companies out there who claim to have developed tools that can potentially deploy a PdA solution to enhance operational performance. Some of these packages appear to be okay, some claim that they are really good, and others seem really ambiguous other than being a tool that a data scientist might use to slice and dice data. What’s missing from most that claim to be more than an overglorified calculator are actual use cases that demonstrate value. Without calling out any names, the one thing these offerings have in common is that they require services (i.e., consulting) on top of the software itself, which is a hidden cost, before they are operational. There is nothing inherently unique about any of these packages; all of the features they tout can be carried out via open-source software and some programming prowess, but herein lies the challenge. Some so-called solutions bank on training potential users (i.e., servicing) for the long term. These packages differ in their look and feel and their operation/programming language, and most seem to require consulting, servicing, or a data science team. In each of these cases, a data scientist must choose a platform (or several), learn its language and/or interface, and then become an expert in the data at hand in order to be successful. In the real world, the problem lies in the fact that data tends to differ for each use case (oftentimes dramatically), and even after data sources have been ingested and modified so they are amenable to predictive analytics, many decisions must be made and there’s a lot of room to be wrong and even more room to overlook.

“…a tall order for a human.”

Unfortunately, data scientists, by nature, are subjective (at least in the short term) and slow, whereas good data science must be objectively contextual and quick to deploy, since there are so many different ways to develop a solution. A good solution must be dynamic when there may be thousands of options. A good product will be objective, context driven, and able to capitalize on new information stemming from a rapidly changing environment. This is a tall order for a human. In fairness, data science is manual and tough (there’s a tremendous amount of grunt work involved), and in a world of many “not wrong” paths, the optimal solution may not be quickly obtained, if at all. That said, a data science team might not be an ideal end-to-end solution when the goal is a long-term auto-dynamic solution that is adaptive, can be deployed in a live environment rapidly, and can scale quickly across different use cases.

“…a good solution must be dynamic…”

End-to-end PdA platforms are available (Data Ingestion -> Data Cleansing/Modification -> Prediction -> Customer Interfacing). Predikto is one such platform, where the difference is auto-dynamic scalability that relieves much of the burden from a data science team. Predikto doesn’t require a manual data science team to ingest and modify data for a potential predictive analytics solution. This platform takes care of most of the grunt work in a very sophisticated way while capitalizing on detail from domain experts, ultimately providing customers with what they want (accurate predictions) very rapidly and at a fraction of the cost of a data science team, particularly when the goal is to deploy predictive analytics solutions across a range of problem areas. This context-based solution also automatically adapts to feedback from operations regarding the response to the predictions themselves.

Predikto Solution Utilizing Predictive Analytics

 

Skeptical? Let us show you what Auto-Dynamic Predictive Analytics is all about and how it can reduce downtime in your organization. And by the way, it works… [patents pending]

Predikto Enterprise Platform

How to increase the profitability of your organization by 20%

A recent report published by Gartner states that organizations that use predictive business performance metrics will increase their profitability by 20% by 2017. The report also shows that organizations should alert workers that a business moment is about to occur, and guide them on the next action to take in the context of a particular customer’s expectation.

“Using historical measures to gauge business and process performance is a thing of the past”, says Samantha Searle, analyst at Gartner. “To prevail in challenging market conditions, businesses need predictive metrics – also known as ‘leading indicators’ – rather than just historical metrics (aka ‘lagging indicators’).” “Predictive risk metrics are particularly important for mitigating and even preventing the impact of disruptive events on profitability”, she added.

Ms. Searle also stated that “business process directors who don’t apply predictive metrics to cross-boundary business processes will leave their organizations vulnerable to the risk of failing to execute their business strategies.”

To successfully implement a predictive analytics strategy that will help achieve the expected business performance, Gartner recommends the following:

  • Identify the business processes that are critical to driving strategic business outcomes and strategy execution.
  • Determine how best to measure business outcomes in a way that triggers human or automated actions before an undesired outcome occurs.
  • Explore how to leverage existing operational data, analytics and other sources of information in more predictive algorithms.
  • Employ predictive risk metrics to avoid process failure or business disruption.

Another Gartner survey showed that 71% of the 498 business and IT leaders surveyed understood which KPIs are critical to supporting the business strategy, but only 48% of them can access those metrics, and no more than 31% said they have a dashboard that provides visibility into those metrics. “Visible metrics won’t help drive strategic business outcomes, such as increasing profitability, if business and IT leaders don’t have the right metrics in place”, said Ms. Searle.

Predikto is uniquely positioned to help organizations deliver significant operational process improvement value by leveraging the power of predictive analytics.

You can read the Gartner press release here: http://www.gartner.com/newsroom/id/2650815

Bye ‘Big Data’. Hello ‘Smart Analytics’

John De Goes wrote an interesting blog post on Venture Beat in which he argues that the term “Big Data” is dead. His arguments are strong and Predikto agrees with his statements. We run into the terms Big Data and Predictive Analytics a lot. In many cases they do not apply, and people have not agreed on a definition of Big Data. Sometimes we see vendors using the term Predictive Analytics when their software or technology solution does not predict the future using machine learning or statistical methods. The “Predictive Analytics” technology just uses an advanced GUI to send someone an automated notification to react once a sensor has reached a threshold. We also run into many cases where someone talks big data, but you can open the CSV files in Excel.
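The difference between a threshold notification and an actual prediction can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names, the sensor readings, and the limit are invented, and the straight-line least-squares fit is the simplest possible statistical stand-in, not the method Predikto or any vendor uses. The first function reacts only after the limit has been reached; the second estimates ahead of time how many more readings remain before the trend reaches it.

```python
def threshold_alert(readings, limit):
    """Reactive rule: fire only once the latest reading has reached the limit."""
    return readings[-1] >= limit


def steps_until_limit(readings, limit):
    """Predictive sketch: fit a straight line to the readings (least squares)
    and estimate how many future steps remain before the trend reaches the limit.

    Returns None when there are too few readings or the trend is flat or falling.
    """
    n = len(readings)
    if n < 2:
        return None
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (limit - intercept) / slope  # position where the fitted line hits the limit
    return max(0.0, crossing - (n - 1))     # steps beyond the latest reading


temps = [70.0, 72.0, 74.0, 76.0, 78.0]       # made-up sensor readings
print(threshold_alert(temps, limit=90.0))    # False: nothing to report yet
print(steps_until_limit(temps, limit=90.0))  # 6.0: the trend reaches 90 in six more readings
```

With these readings the reactive rule stays silent, while the trend-based estimate leaves six readings' worth of time to act.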

Predikto, like John De Goes, likes the term “smart data”, but we like to focus on analytics that enable an action that impacts the bottom line. We call this “Smart Analytics”. When you combine the human skills of smart scientists, developers, process experts, and infrastructure gurus with the technologies used to enable solutions, you have a winning combination. It’s this careful balance of combining all components into a solution that enables an action that makes it really challenging and really FUN.
