IKH

Introduction

Welcome to the course project. This project will test your knowledge of the batch processing tools that you have learnt throughout this course. It mainly revolves around Apache Sqoop, Apache Spark (through PySpark), Amazon S3 and Amazon Redshift, which are some of the most widely used tools in the industry.

In this video, our expert will give a brief introduction to the ETL project that we will be going through in this module.

In this project, you will go through a real-world use case from the banking sector.

Your task is to build a batch ETL pipeline that reads transactional data from RDS, transforms it and loads it into Redshift tables. You will then run analytical queries on the loaded data.
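To make the extract, transform and load stages concrete, here is a minimal sketch in plain Python. It is not the project solution: in-memory SQLite databases stand in for RDS and Redshift purely so the sketch runs locally, and the table names, column names and sample rows (`transactions`, `fact_transactions`, `txn_id`, `amount`, `status`) are hypothetical, made up for illustration.

```python
import sqlite3


def extract(source):
    """Extract: read every raw transaction row from the source database."""
    query = "SELECT txn_id, amount, status FROM transactions"
    return source.execute(query).fetchall()


def transform(rows):
    """Transform: keep completed transactions only and drop the status column."""
    return [
        (txn_id, round(amount, 2))
        for txn_id, amount, status in rows
        if status == "COMPLETED"
    ]


def load(warehouse, rows):
    """Load: write the cleaned rows into the warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_transactions (txn_id INTEGER, amount REAL)"
    )
    warehouse.executemany("INSERT INTO fact_transactions VALUES (?, ?)", rows)
    warehouse.commit()


def run_pipeline():
    # Hypothetical source data; SQLite stands in for the RDS database.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE transactions (txn_id INTEGER, amount REAL, status TEXT)"
    )
    source.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?)",
        [(1, 120.5, "COMPLETED"), (2, 75.0, "FAILED"), (3, 30.25, "COMPLETED")],
    )

    # A second SQLite database stands in for the Redshift warehouse.
    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract(source)))

    # Analytical query on the loaded data: row count and total amount.
    summary = "SELECT COUNT(*), SUM(amount) FROM fact_transactions"
    return warehouse.execute(summary).fetchone()


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage in its own function mirrors how a batch pipeline is usually split, so that one stage can later be swapped for a different tool without touching the others.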

People you will hear from in this project

Subject Matter Expert

Ganesh Gurusiddaiah

Big Data Technology Lead

Ganesh is a Big Data ETL Leader with expertise in data warehouse solutions. He has almost 10 years of experience in the software domain and holds a master’s degree in Software Systems from BITS Pilani.
