A unified data processing engine empowers Big Data application developers because it makes connections between seemingly unrelated use cases natural. This talk discusses a Spark implementation of BigPetStore, a project that is part of Apache Bigtop. The implementation uses the Spark RDD API to generate transaction data, DataFrames and Spark SQL for ETL and reporting, MLlib to build a recommender system on the transaction data, and Spark Streaming to serve online recommendations.
Solution Architect, Cloudera
Balassi Márton is a Solution Architect at Cloudera. He focuses on Big Data application development, especially in the streaming space. Márton is a regular contributor to open source and has spoken at a number of Big Data-related conferences and meetups, including recent appearances at Hadoop Summit and Apache Big Data.