Session Overview

Welcome to the session on ‘Industry Demo – Processing Tweets in Real-Time’.

In the previous session, you learnt about the concepts of window operations and also understood how late-arriving data can be handled using watermarks.

In the next video, our SME Ajay will provide an overview of the topics that you will learn in this session.

In the previous three sessions, you learnt about the various features of the Spark Streaming API, the general flow of code and the architecture of a Spark Streaming application. You also looked at a few transformations followed by window operations.

Now, you will take a look at an actual industry use case to understand the overall working of an application. First, you will learn how Spark can be integrated with Apache Kafka to read and write messages.

Next, you will create a Spark Structured Streaming application that reads tweets in real time. This use case will demonstrate the complete flow of a Spark application, from start to finish.

Let’s get started!

People you will hear from in this session

Subject Matter Expert

Ajay Shukla

Senior Data Engineer

Ajay is currently working as a senior data engineer. He has over nine years of experience in the IT industry and has worked at various companies. He has deep knowledge of various tools and technologies that are used today.
