English

Modern Real-Time Data Analytics with Spark Structured Streaming

Structured Streaming, the next-generation real-time data analytics component of Apache Spark, was released last year. Structured Streaming allows you to think about your incoming data flow as an unbounded, ever-growing table. You query it as if it were a simple static database table, and your query results get updated in real time. Spark handles scaling and the heavy lifting behind the scenes.
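To make the "unbounded table" idea concrete, here is a minimal conceptual sketch in plain Python. It does not use Spark at all: the class `RunningCountQuery`, the function `batch_count`, and the sensor events are hypothetical names and data invented for illustration. It only shows the principle that a query over a growing table can be kept current by processing just the newly arrived rows, while giving the same answer as rerunning the query over the whole table.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical sensor event: (machine_id, status). Invented for illustration.
Event = Tuple[str, str]


class RunningCountQuery:
    """Toy model of a streaming aggregation: count events per machine."""

    def __init__(self) -> None:
        # The "result table", kept up to date as new rows arrive.
        self.result: Counter = Counter()

    def process_batch(self, new_rows: Iterable[Event]) -> Dict[str, int]:
        # Only the newly arrived rows are processed; earlier rows are
        # already reflected in self.result (incremental execution).
        for machine_id, _status in new_rows:
            self.result[machine_id] += 1
        return dict(self.result)


def batch_count(all_rows: Iterable[Event]) -> Dict[str, int]:
    """The same query, run as a one-off job over a static table."""
    return dict(Counter(machine_id for machine_id, _status in all_rows))


if __name__ == "__main__":
    micro_batches: List[List[Event]] = [
        [("press-1", "ok"), ("press-2", "ok")],
        [("press-1", "overheat")],
        [("press-2", "ok"), ("press-2", "ok"), ("lathe-7", "ok")],
    ]
    query = RunningCountQuery()
    table_so_far: List[Event] = []  # the ever-growing input table
    for batch in micro_batches:
        table_so_far.extend(batch)
        snapshot = query.process_batch(batch)
        # The incremental result always matches a full recomputation.
        assert snapshot == batch_count(table_so_far)
        print(snapshot)
```

In Spark itself, the corresponding building blocks are `spark.readStream` for the input, ordinary DataFrame operations such as `groupBy(...).count()` for the query, and `writeStream` with an output mode for the continuously updated result; the sketch above only mimics that model and says nothing about how Spark implements it.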

This will be a hands-on demo where you will see the technology in action through an Industry 4.0 use case.

At the end of this talk, you will understand the basic concepts of this technology and you will know how to move on if you want to integrate it into your own projects.

Tóth Zoltán
CTO, Datapao

I design and implement Big Data and Spark architectures at Datapao, mostly for online and Industry 4.0 clients. Besides working on data infrastructures, I work as a senior consultant and instructor at Databricks, the company created by the founders of Apache Spark. Earlier, I worked on the Spark integration project at RapidMiner and led the Data Engineering and Business Modeling teams at Prezi.