Using the Spark Datasource API to access a Database

April 10th, 2015 | Categories: Predictive Analytics, Technology & Engineering

At Predikto, we’re big fans of in-memory distributed processing for large datasets. Much of our processing happens inside Spark, for both speed and scale, and with the recently released Datasource API and its JDBC connectivity, integrating with external datasources got a lot easier. The Spark documentation covers the basics of the API and DataFrames. However, there is little information online about actually getting this feature to work.
TL;DR: Scroll to the bottom for the complete Gist.
In this […]