With this, you have completed this session. Let’s hear from Vishwa as he summarises your learnings from this session.
Your learnings from this session can be summarised as follows:
- Kafka Connect is a framework that connects Kafka with external systems, such as databases and other streaming services.
- Using Kafka Connect, you can move the data either from a Kafka topic to an external system or from an external system to a Kafka topic.
- There are many open-source connectors available for transferring data. Source connectors move data from external systems into Kafka topics, while sink connectors move data from Kafka topics to external systems (a sample source connector configuration is sketched after this list).
- Connectors coordinate a set of tasks that copy the data.
- Workers are the running processes that execute connectors and tasks, and a Kafka Connect cluster is a group of such workers. Kafka Connect can be deployed in two modes: standalone mode, which runs a single worker, and distributed mode, which spreads connectors and tasks across multiple workers (see the launch commands sketched after this list).
- You learnt how to pull data from Twitter and store it in a Kafka topic using a source connector.
- Kafka Streams is a client library for building stream-processing applications and microservices, where both the input and the output data are stored in a Kafka cluster.
- Stream processors take data from upstream processors, apply transformations to it and pass it on to downstream processors (a minimal Kafka Streams topology is sketched after this list).
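As a reference for the Kafka Connect points above, the properties file below sketches how a Twitter source connector might be configured to write tweets into a Kafka topic. This is only a minimal sketch: the connector class and the `twitter.*` and `filter.keywords` property names assume a community Twitter connector, and the topic name and credentials are placeholders, so check the documentation of the connector you actually install for the exact keys.

```properties
# Hypothetical standalone configuration for a community Twitter source connector.
# The connector class and the twitter.* property names are assumptions.
name=twitter-source-connector
connector.class=com.github.jcustenborder.kafka.connect.twitter.TwitterSourceConnector
tasks.max=1

# Kafka topic that the source connector writes tweets into (placeholder name).
kafka.status.topic=twitter_tweets

# Tweets matching these keywords are pulled from the Twitter API.
filter.keywords=kafka,bigdata

# Twitter API credentials (placeholders).
twitter.oauth.consumerKey=<consumer-key>
twitter.oauth.consumerSecret=<consumer-secret>
twitter.oauth.accessToken=<access-token>
twitter.oauth.accessTokenSecret=<access-token-secret>
```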
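The commands below illustrate the two worker modes, assuming a standard Apache Kafka installation where the Connect scripts live under `bin/` and the worker configuration files under `config/`. The connector properties file name is a placeholder.

```sh
# Standalone mode: a single worker process runs the connector;
# the connector configuration is passed as a properties file on the command line.
bin/connect-standalone.sh config/connect-standalone.properties twitter-source.properties

# Distributed mode: start one worker per machine with the same group.id;
# connectors are then submitted to the cluster through the Connect REST API.
bin/connect-distributed.sh config/connect-distributed.properties
```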
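Finally, the Java sketch below shows what a minimal Kafka Streams topology looks like: a source processor reads from an input topic, a stream processor transforms each record, and a sink processor writes the result to an output topic. The topic names and the transformation are illustrative assumptions, not part of the session.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class TweetLengthApp {
    public static void main(String[] args) {
        // Basic configuration: application id and the Kafka cluster to read from and write to.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tweet-length-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Build the topology: source processor -> stream processor -> sink processor.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> tweets = builder.stream("twitter_tweets"); // upstream source (assumed topic)
        tweets.mapValues(text -> "length=" + text.length())                // transformation step
              .to("tweet_lengths");                                        // downstream sink (assumed topic)

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the Streams application cleanly on shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```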
The PPT that was used throughout this session is attached below.