Welcome to the module ‘Tree Models’. In this module, you will learn about two important machine learning models: decision trees and random forests.
You will first learn about decision trees and then proceed to learn about random forests, which are a collection of multiple decision trees. A collection of multiple models is called an ensemble.
With high interpretability and an intuitive algorithm, decision trees mimic the human decision-making process and are efficient at handling categorical data. Unlike algorithms such as logistic regression and support vector machines (SVMs), decision trees do not assume a linear relationship between the independent variables and the target variable; instead, they can model highly non-linear data.
You can use decision trees to trace all the factors that lead to a particular decision or prediction, which makes them useful for explaining business decisions to stakeholders. Decision trees also form the building blocks of random forests, which are popular in the Kaggle community.
Random forests are collections of multiple trees and are considered to be one of the most efficient machine learning models. By the end of this module, you should be able to use decision trees and random forests to solve both classification and regression problems.
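To make the contrast concrete, here is a minimal sketch (assuming scikit-learn is available) that trains a single decision tree and a random forest, an ensemble of many trees, on the same classification data. The dataset, hyperparameters, and variable names are illustrative choices, not part of the module.

```python
# Illustrative sketch: a single decision tree vs. a random forest ensemble.
# Assumes scikit-learn is installed; dataset and settings are arbitrary.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# A single decision tree: highly interpretable, one set of if/else splits.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# A random forest: a collection (ensemble) of many decision trees
# whose individual predictions are combined by majority vote.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

print("Tree accuracy:  ", tree.score(X_test, y_test))
print("Forest accuracy:", forest.score(X_test, y_test))
```

The same two classes also have regressor counterparts (`DecisionTreeRegressor`, `RandomForestRegressor`), which is why, as noted above, tree models apply to both classification and regression problems.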
In this session:
- Introduction to decision trees.
- Interpretation of decision trees.
- Building decision trees.
- Tree models over linear models.
- Decision trees for regression problems.
Guidelines for in-module questions
In-video and in-content questions are not graded.
People you will hear from in this session:
- Subject Matter Expert
- Professor, IIIT-B
- Analytics Lead, Flipkart
- Presenter, Data Analytics
References:
An Introduction to Statistical Learning, pp. 303–315