You have learnt some classical machine learning algorithms, such as linear regression and logistic regression, for solving regression and classification problems. So why do you need to go beyond these linear models? Linear models do not handle collinearity and non-linear relationships in the data well. This is where decision trees come in, as they can cope with both. You will learn about each of these in detail as you go ahead.
A decision tree, as the term suggests, uses a tree-like model to make predictions. It resembles an upside-down tree and follows a process similar to the one you use to make decisions in real life, i.e., asking a series of questions to arrive at a decision.
A decision tree splits data into multiple sets of data. Each of these sets is then further split into subsets to arrive at a decision. Let’s hear from Prof. Raghavan as he explains this process in detail.
As you saw in this video, a decision tree uses a natural decision-making process, i.e., it asks a series of questions in a nested if-then-else structure. At each node, you ask a question to further split the data held by that node. If the test passes, you move to the left; otherwise, you move to the right.
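The nested if-then-else structure described above can be sketched in a few lines of Python. This is only an illustrative, hand-written tree: the function name predict_activity, the features is_dry and temperature_c, and the 15-degree threshold are all hypothetical and are not taken from the lecture. In practice, the questions and thresholds are learnt from data rather than written by hand.

```python
def predict_activity(is_dry: bool, temperature_c: float) -> str:
    """Hand-written decision tree (hypothetical features and threshold)."""
    # Root node: the first question asked of every data point.
    if is_dry:
        # Test passed, so move to the left child and ask another question.
        if temperature_c >= 15:
            # Test passed again: left leaf.
            return "play outside"
        else:
            # Test failed: right leaf.
            return "stay inside"
    else:
        # Root test failed, so move to the right child, which is a leaf.
        return "stay inside"


print(predict_activity(is_dry=True, temperature_c=22.0))
print(predict_activity(is_dry=False, temperature_c=22.0))
```

Each if statement plays the role of a node that asks a question, and each return statement plays the role of a final decision.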
The first and top node of a decision tree is called the root node. The arrows in a decision tree always point away from this node.
A node that cannot be further classified or split is called a leaf node. The arrows in a decision tree always point towards a leaf node and never away from it.
Any node that has descendant nodes, and is therefore not a leaf node, is called an internal node.
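To make the three node types concrete, here is a minimal sketch that builds a small tree by hand and labels each node. The Node class, the node_kind helper and the question strings are hypothetical names chosen for illustration. Note one assumption in the helper: the root also has descendant nodes, but it is reported under its own name because it is checked first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary decision tree (illustrative only)."""
    label: str                      # the question asked, or the final decision
    left: Optional["Node"] = None   # child followed when the test passes
    right: Optional["Node"] = None  # child followed when the test fails

    def is_leaf(self) -> bool:
        # A leaf has no children, so it cannot be split further.
        return self.left is None and self.right is None


def node_kind(node: Node, root: Node) -> str:
    """Return 'root', 'leaf' or 'internal' for a node of the given tree."""
    if node is root:
        return "root"       # the first and top node of the tree
    if node.is_leaf():
        return "leaf"       # arrows only point towards this node
    return "internal"       # has descendants and is not the root


# Build a small tree from the bottom up.
play = Node("play outside")
stay_cold = Node("stay inside")
stay_wet = Node("stay inside")
temperature = Node("temperature >= 15?", left=play, right=stay_cold)
root = Node("is it dry?", left=temperature, right=stay_wet)

for n in (root, temperature, play, stay_cold, stay_wet):
    print(f"{n.label}: {node_kind(n, root)}")
```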
In the next segment, you will take a look at some real-life examples to understand decision trees better.
Additional Reading: