Deploying Predictive Analytics and Scalability: Part 1 – Patience is required, but can you afford it?

*Image: data processing in predictive analytics*

Having been involved in applied data science for over a decade, I’ve found that one of the most substantive shortcomings of analytics, as a viable business product, is scalability. Data management, feature engineering, and multivariate statistics/machine learning are conceptually challenging topics that take time to master. Even for a seasoned data scientist (or team) that can tackle the full-stack solution from data collection to predictive output and validation (i.e., end-to-end), the process is tedious and slow, and veterans are rare. Moreover, each new application, or instance, may require major re-development, or even starting over from scratch, even for solutions stemming from the same data sources. In short, “good” data science is slow and arduous, and it won’t scale without considerable time and investment. Tread with caution; this is as much art as it is science. But can you afford the time? Perhaps there is a better way.

Robert Morris, Ph.D. is Chief Science Officer and Co-founder of Predikto, Inc.