
Stream Processing Topology

In this segment, you will understand what streams and stream processing applications are. You will also understand the stream processing topology. Let’s watch the next video to learn about this in detail.

A stream represents an unbounded, continuously updating data set. It is an ordered, replayable and fault-tolerant sequence of immutable data records. Each data record is defined as a key-value pair. Any program or application that processes a stream of data is called a stream processing application.

Let’s look at the processor topology and understand its different components.

A processor topology is a graph in which stream processors act as nodes and streams act as the edges connecting them. The source processor does not have any upstream processor. It reads data from one or more Kafka topics and then produces a stream that acts as an input to the downstream processors.

A stream processor receives input records from its upstream processors and then applies some operations to this stream to produce one or more output streams for its downstream processors.

A sink processor does not have any downstream processor. It sends the records received from its upstream processors to specified Kafka topics.
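The three processor roles described above can be sketched as a simple pipeline. This is a minimal, library-free illustration in Python; the function names are hypothetical and are not part of the Kafka Streams API, and in-memory lists stand in for Kafka topics.

```python
def source_processor(topic_records):
    """Source processor: no upstream; reads key-value records from a 'topic'."""
    for key, value in topic_records:
        yield key, value

def stream_processor(upstream):
    """Stream processor: applies an operation (here, upper-casing the value)."""
    for key, value in upstream:
        yield key, value.upper()

def sink_processor(upstream, output_topic):
    """Sink processor: no downstream; writes records to an output 'topic'."""
    for record in upstream:
        output_topic.append(record)

# In-memory stand-ins for Kafka topics (illustrative only).
input_topic = [("k1", "hello"), ("k2", "world")]
output_topic = []

sink_processor(stream_processor(source_processor(input_topic)), output_topic)
print(output_topic)  # [('k1', 'HELLO'), ('k2', 'WORLD')]
```

Note how the records flow strictly from the source, through the stream processor, to the sink, mirroring the edges of the topology graph.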

Let’s hear from Vishwa as he explains the role of different processors using an example.

In the video above, Vishwa explained the roles played by the different processors. The Kafka Streams application has to read the words arriving in a topic and count the occurrences of each word. The source processor reads each word from the Kafka topic and forwards the record to the stream processor. The stream processor counts the occurrences of each word and passes this information to the sink processor. The sink processor then writes this information back to another Kafka topic.
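The word-count flow described above can be sketched as follows. This is a conceptual Python sketch, not the Kafka Streams DSL; the in-memory lists standing in for the input and output topics are assumptions for illustration.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for Kafka topics.
input_topic = ["kafka", "streams", "kafka"]
output_topic = []

counts = Counter()             # stream processor state: running word counts
for word in input_topic:       # source processor: read each word from the topic
    counts[word] += 1          # stream processor: update the count for this word
    output_topic.append((word, counts[word]))  # sink processor: emit updated count

print(output_topic)  # [('kafka', 1), ('streams', 1), ('kafka', 2)]
```

Each incoming word produces an updated (word, count) record downstream, which is how the count stream in the example continuously reflects the latest totals.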

Additional Reading

Stream Processing - Read this to learn more about the need for stream processing and why it is necessary today.
