Now that you have learnt what data streams are and seen their various industry use cases, in this segment you will learn about the two aspects of data processing: batch and streaming.
So, let’s watch the upcoming video and learn about these from our expert.
So, in the video, you learnt about batch processing and micro-batching.
Batch processing divides data into batches based on a time window, which could be hourly, daily, weekly or monthly. Because the windows can be large, each batch can contain a high volume of data.
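To make the idea concrete, here is a minimal Python sketch of time-window batching. It is not part of the course material: the function name `batch_by_window` and the use of epoch-second timestamps are assumptions made for illustration.

```python
from collections import defaultdict

def batch_by_window(records, window_seconds):
    """Group (epoch_seconds, value) records into fixed time-window batches."""
    batches = defaultdict(list)
    for ts, value in records:
        # Truncate the timestamp to the start of its window,
        # e.g. window_seconds=3600 gives hourly batches.
        window_start = ts - (ts % window_seconds)
        batches[window_start].append(value)
    return dict(batches)

# Three events batched into hourly (3600-second) windows
records = [(3600, "a"), (3700, "b"), (7300, "c")]
print(batch_by_window(records, 3600))  # {3600: ['a', 'b'], 7200: ['c']}
```

The same function models daily or weekly batches simply by passing a larger `window_seconds`, which is why larger windows accumulate larger batches.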
However, large windows also mean substantial latency before the data reaches the BI layer, which makes batch processing a poor fit when we need results instantly. So, what happens if we keep reducing the batch size?
We could shrink the time window to minutes, or even seconds, which would give us near-real-time data availability. This is exactly what Spark Streaming does: it runs on micro-batches, and with the trigger interval set to 0 it reads data continuously, making it practically indistinguishable from true streaming.
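The micro-batching idea can be sketched without Spark at all. The toy loop below, written for this explanation rather than taken from Spark's API, drains whatever has arrived in a buffer on each trigger; the names `micro_batch_loop`, `buffer`, `handler` and `interval` are all made up for illustration. Shrinking `interval` toward 0 is what makes the loop behave almost like true streaming.

```python
import time
from collections import deque

def micro_batch_loop(buffer, handler, interval=0.01, max_batches=5):
    """Repeatedly drain everything waiting in `buffer` and hand it to
    `handler` as one micro-batch; `interval` plays the role of the trigger."""
    for _ in range(max_batches):
        batch = []
        while buffer:
            batch.append(buffer.popleft())
        if batch:
            handler(batch)
        time.sleep(interval)  # pause between triggers; near 0 ≈ continuous

# Events already waiting are picked up on the very next trigger
events = deque(["event-1", "event-2"])
seen = []
micro_batch_loop(events, seen.extend, interval=0.01, max_batches=2)
print(seen)  # ['event-1', 'event-2']
```

In real Spark Structured Streaming the trigger is configured on the write side of the query rather than in user code like this, but the processing model, repeated small batches on a short timer, is the same.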
Additional Readings:
Batch vs. Streaming: to learn more about batch processing versus stream processing.