Welcome to the machine learning module on exploratory data analysis, also known as EDA.
Prerequisites
Before proceeding with this module, you are expected to have a good grasp of the Python libraries pandas, NumPy, Matplotlib and seaborn. The modules for these libraries are provided as part of the optional preparatory content.
In this module
You will learn how to explore a data set end-to-end, i.e., how to extract the maximum insights from a data set and how to make useful business decisions based on those insights.
As you move ahead in this module, you will learn about the different steps involved in exploratory data analysis and also understand how to infer useful and actionable insights from a given data set. EDA is arguably the most important and revelatory step in any kind of data analysis.
Let’s quickly go through the module flow with Anand.
By now, you have a fair understanding of EDA and the following broad topics that will be covered in this module.
- Data sourcing
- Data cleaning
- Univariate analysis
- Bivariate and multivariate analysis
To understand the practical aspects of EDA, you will work on a case study using the ‘bank telemarketing campaign’ data set, implemented in Python.
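The four topics listed above can be sketched in a few lines of pandas. This is only a minimal illustration, not the case-study code: the inline data, the column names (`age`, `job`, `balance`, `response`) and the choice of median imputation are all assumptions made for the example and are not taken from the bank telemarketing data set.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for a real source; the columns are
# illustrative and are NOT taken from the case-study data set.
RAW_CSV = """age,job,balance,response
34,admin,1200,yes
45,technician,,no
29,admin,300,no
52,retired,5000,yes
45,technician,,no
"""

# 1. Data sourcing: load the raw records (here from an in-memory string).
df = pd.read_csv(io.StringIO(RAW_CSV))

# 2. Data cleaning: drop exact duplicate rows, then fill missing balances
#    with the median of the observed balances (one possible choice).
df = df.drop_duplicates().reset_index(drop=True)
df["balance"] = df["balance"].fillna(df["balance"].median())

# 3. Univariate analysis: look at one variable at a time.
age_summary = df["age"].describe()
job_counts = df["job"].value_counts()

# 4. Bivariate analysis: relate two variables, e.g. the share of "yes"
#    responses within each job category.
response_rate_by_job = (df["response"] == "yes").groupby(df["job"]).mean()

print(age_summary)
print(job_counts)
print(response_rate_by_job)
```

Each numbered comment corresponds to one of the topics above; on a real data set every step involves far more judgement than this sketch suggests.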
In this session
You will learn about various data sources and how to source data from both public and private sources. Data sourcing is the very first step of any data analysis activity. You will focus on public data sets, as they are freely available to access and use. Here, you will be introduced to some useful websites and to techniques such as web scraping, which is used to obtain data from websites.
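To give a rough idea of what web scraping involves, the sketch below extracts the rows of an HTML table using only Python's standard library. It is an illustration under stated assumptions: the page content in `SAMPLE_HTML` is hypothetical and embedded in the code, and `TableParser` is a helper written for this example. A real scraper would first download the page, and should respect the website's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a
# website instead of hard-coding it.
SAMPLE_HTML = """
<html><body>
  <table>
    <tr><th>City</th><th>Population</th></tr>
    <tr><td>Alpha</td><td>1000</td></tr>
    <tr><td>Beta</td><td>2500</td></tr>
  </table>
</body></html>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []            # finished rows, each a list of cell strings
        self._current_row = None  # row being built, or None outside <tr>
        self._in_cell = False     # True while inside <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._current_row = []
        elif tag in ("td", "th") and self._current_row is not None:
            self._in_cell = True
            self._current_row.append("")

    def handle_data(self, data):
        # Keep only text that appears inside a table cell.
        if self._in_cell:
            self._current_row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._current_row is not None:
            self.rows.append(self._current_row)
            self._current_row = None


parser = TableParser()
parser.feed(SAMPLE_HTML)

# The first row holds the column names; the rest are data records.
header, *records = parser.rows
print(header)
print(records)
```

The extracted `header` and `records` could then be loaded into a pandas DataFrame for cleaning and analysis.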
Guidelines for in-module questions
The in-video and in-content questions for this module are not graded. Note that graded questions are given in a separate segment labelled ‘Graded Questions’ at the end of each session. The graded questions will adhere to the following guidelines:
Question Type | First Attempt Marks | Second Attempt Marks |
Questions with 2 Attempts | 10 | 5 |
Questions with 1 Attempt | 10 | 0 |
People you will hear from in this session
Subject Matter Expert
Mirza Rahim Baig
Analytics Lead, Flipkart
Flipkart is one of the leading e-commerce companies in India. It started by selling books and has since expanded into almost every product category, including consumer electronics, fashion and lifestyle products. Rahim is currently the analytics lead at Flipkart. He holds a graduate degree from BITS Pilani, a premier educational institute in India.
Subject Matter Expert
Anand S
CEO, Gramener
Gramener is one of the most prominent data analytics and visualisation companies in India. Anand, currently the CEO, was previously the Chief Data Scientist at Gramener and also has extensive experience in management consulting and equity research.