So, this marks the end of the case study. Here’s what you’ve learnt in the past two sessions.
Summary of case study
- First, you did a fair bit of data handling and cleaning: removing junk records, handling missing values, changing data types, removing outliers, etc.
- When you analysed the ratings using a histogram, you saw that they are skewed towards higher values.
- Using a bar chart, you saw that most of the apps belong to the Everyone content rating category.
- Using a scatter plot, you also observed a weak trend between the ratings and the size of the app. You also briefly explored reg plots to understand their nuances.
- Using a pair plot, you were able to see multiple scatter plots at once and draw several inferences, for example, that price and rating show only a very weak trend, that reviews and price are inversely related, and so on.
- After that, you used estimator functions along with bar plots, as well as box plots, to observe the spread of ratings across the different Content Rating categories. Here, your main observation was that the Everyone category has a lot of apps with very low ratings.
- Finally, you created a heat map comparing the ratings across different Reviews and Content Rating buckets.
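
As a rough recap in code, the sketch below walks through a cut-down version of the workflow summarised above: cleaning the data, bucketing the reviews, and building the table that a heat map would display. It is only an illustration under stated assumptions. The toy data is made up, the column names `Rating`, `Reviews` and `Content Rating` are assumed from the names used in this summary, the 1 to 5 rating scale is an assumption, and the helpers `clean` and `rating_table` are hypothetical names, not functions from the sessions.

```python
import pandas as pd

# Hypothetical toy data standing in for the case-study dataset (all values made up).
apps = pd.DataFrame({
    "Rating": [4.5, 4.1, 3.2, 4.8, 19.0, 2.1, 4.4, None, 3.9, 4.6],
    "Reviews": ["120", "5400", "30", "98000", "15", "8", "760", "42", "2300", "150000"],
    "Content Rating": ["Everyone", "Teen", "Everyone", "Everyone", "Everyone",
                       "Everyone", "Teen", "Mature 17+", "Mature 17+", "Teen"],
})


def clean(df):
    """Drop missing ratings, fix the Reviews data type, remove junk ratings."""
    out = df.dropna(subset=["Rating"]).copy()
    out["Reviews"] = out["Reviews"].astype(int)   # stored as text in the toy data
    out = out[out["Rating"] <= 5]                 # assumes ratings run from 1 to 5
    return out


def rating_table(df):
    """Median rating for each Content Rating and Reviews bucket."""
    out = df.copy()
    # Split Reviews into three equal-sized buckets (terciles).
    out["Review Bucket"] = pd.qcut(out["Reviews"], q=3,
                                   labels=["Low", "Medium", "High"])
    return out.pivot_table(index="Content Rating", columns="Review Bucket",
                           values="Rating", aggfunc="median", observed=False)


cleaned = clean(apps)
table = rating_table(cleaned)
print(table)
# With seaborn installed, sns.heatmap(table, annot=True) would draw this table
# as a heat map.
```

The median is used as the aggregation here purely as an example; swapping `aggfunc` for `"min"` or `"mean"` changes what each cell of the heat map reports, which is the same idea as choosing an estimator for a bar plot.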