In the previous segment, you learnt about data pipelines and the phases that make up their general structure. In this segment, you will see a real-life example and understand why data pipelines must be automated, or orchestrated.
In the next video, Amit will discuss Uber’s example data pipeline in detail.
In the previous video, you looked at a real-world data pipeline and saw all the steps involved in it.
As you can see in the following diagram, the phases we discussed in the earlier segment – Extraction > Storing raw data > Validating > Transforming > Visualising – also appear in the Uber example.
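To make the five phases concrete, here is a minimal sketch in Python that chains them together. It is only an illustration, not Uber's actual pipeline: the function names, the trip records and the in-memory JSON "storage" are all assumptions made up for this example.

```python
import json


def extract():
    # Hypothetical raw trip records standing in for a real source system.
    return [
        {"trip_id": 1, "city": "Pune", "fare": "120.5"},
        {"trip_id": 2, "city": "Delhi", "fare": "not_a_number"},
        {"trip_id": 3, "city": "Pune", "fare": "80"},
    ]


def store_raw(records):
    # Keep an untouched copy of the raw data (here, a JSON string in memory).
    return json.dumps(records)


def _is_number(text):
    # Helper: True if the text can be parsed as a float.
    try:
        float(text)
        return True
    except ValueError:
        return False


def validate(raw_json):
    # Drop records whose fare cannot be parsed as a number.
    records = json.loads(raw_json)
    return [r for r in records if _is_number(r["fare"])]


def transform(records):
    # Aggregate the total fare per city.
    totals = {}
    for r in records:
        totals[r["city"]] = totals.get(r["city"], 0.0) + float(r["fare"])
    return totals


def visualise(totals):
    # A plain-text "report" standing in for a dashboard or chart.
    return "\n".join(f"{city}: {total:.2f}" for city, total in sorted(totals.items()))


def run_pipeline():
    # Extraction > Storing raw data > Validating > Transforming > Visualising
    return visualise(transform(validate(store_raw(extract()))))


if __name__ == "__main__":
    print(run_pipeline())
```

Each phase takes the output of the previous one, which is why a failure or delay in an early phase affects everything that follows.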
Next, we discussed why such pipelines need to be automated. The benefits of automation are as follows:
- An automated pipeline can send notifications when processed data is available or if something fails.
- The reduced manual effort allows companies to focus on business logic.
- Tracking the performance of the different steps in the pipeline helps companies identify and resolve bottlenecks.
- Automating data pipelines allows companies to collect, process and use data economically and in real time.
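The points above on notifications and performance tracking can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular orchestration tool works: `run_with_monitoring`, the `notify` callback and the sample steps are hypothetical names introduced only for this example.

```python
import time


def run_with_monitoring(steps, notify):
    """Run (name, function) steps in order, timing each one.

    Each function receives the previous step's output. `notify` is any
    callable that accepts a message string (an assumption for this sketch;
    in practice it might send an email or a chat message).
    Returns the final output and a dict of per-step durations in seconds.
    """
    timings = {}
    data = None
    for name, step in steps:
        start = time.perf_counter()
        try:
            data = step(data)
        except Exception as exc:
            # Tell someone which step failed, then let the error propagate.
            notify(f"FAILED at step '{name}': {exc}")
            raise
        timings[name] = time.perf_counter() - start
    notify("SUCCESS: processed data is available")
    return data, timings


def demo():
    # Collect notifications in a list instead of sending them anywhere.
    messages = []
    steps = [
        ("extract", lambda _: [3, 1, 2]),
        ("transform", lambda rows: sorted(rows)),
    ]
    result, timings = run_with_monitoring(steps, messages.append)
    return result, timings, messages


if __name__ == "__main__":
    result, timings, messages = demo()
    print(result)
    print(messages)
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f}s")
```

Comparing the recorded durations across runs is one simple way to spot which step is the bottleneck.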
Additional Reading
Airbnb data pipeline/Infrastructure – You can read about another real-world pipeline (Airbnb) here.