
Summary

In this session, you learnt about the learning algorithms behind decision trees. In a step-by-step manner, the algorithm picks an attribute and a rule to split the data into multiple partitions, increasing the homogeneity of each resulting partition.

You also learnt about the various ways in which you can measure the homogeneity of a data set, such as the Gini index, entropy and the mean squared error (MSE).
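As a concrete illustration, here is a minimal sketch of how the Gini index and entropy can be computed for a node's class labels (the function names and example arrays are illustrative, not from the session):

```python
import numpy as np

def gini_index(labels):
    # Gini index: 1 - sum of squared class proportions.
    # 0 for a perfectly homogeneous node; higher means more mixed.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    # Entropy: -sum(p * log2(p)) over class proportions.
    # Also 0 for a perfectly homogeneous node.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

pure = np.array([1, 1, 1, 1])      # one class only
mixed = np.array([0, 0, 1, 1])     # evenly split between two classes

print(gini_index(pure))    # 0.0
print(gini_index(mixed))   # 0.5
print(entropy(mixed))      # 1.0
```

Note how both metrics agree on the extremes: a pure node scores 0, and a 50–50 split scores the maximum for a two-class problem.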

Now, let’s summarise your learnings so far:

  • A decision tree first decides on an attribute to split on.
  • To select this attribute, it measures the homogeneity of the nodes before and after the split.
  • You can measure homogeneity in various ways with metrics like Gini index and entropy.
  • The attribute that results in the greatest increase in homogeneity is then selected for splitting.
  • This whole cycle is then repeated on each partition until you obtain sufficiently homogeneous nodes.
