Building Decision Trees in Python

In the previous session, we built a decision tree with the default hyperparameters. Let's now learn how to tune some of these hyperparameters and see what difference they make to model performance.

Let's first create some helper functions to evaluate your model; since you will be evaluating it multiple times, these functions will make the job easier.
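A minimal sketch of what such a helper might look like, assuming a scikit-learn classifier and pre-split X_train/y_train and X_test/y_test arrays (the function name and layout here are illustrative, not necessarily identical to the ones used in the session):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

def evaluate_model(dt_classifier, X_train, y_train, X_test, y_test):
    """Print train/test accuracy and confusion matrices for a fitted tree."""
    y_train_pred = dt_classifier.predict(X_train)
    y_test_pred = dt_classifier.predict(X_test)

    print("Train accuracy:", accuracy_score(y_train, y_train_pred))
    print("Train confusion matrix:")
    print(confusion_matrix(y_train, y_train_pred))
    print("-" * 40)
    print("Test accuracy:", accuracy_score(y_test, y_test_pred))
    print("Test confusion matrix:")
    print(confusion_matrix(y_test, y_test_pred))
```

Comparing train and test metrics side by side is what lets you spot overfitting as you vary the hyperparameters in the next step.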

Now that you have created the helper functions, let's change some of the default hyperparameters, such as max_depth, min_samples_split, min_samples_leaf and criterion (Gini or entropy), and see how they affect model performance.
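As a sketch, here is how you might set these hyperparameters explicitly on a scikit-learn DecisionTreeClassifier; the specific values are illustrative, and evaluate_model refers to the hypothetical helper above:

```python
from sklearn.tree import DecisionTreeClassifier

# Restrict tree growth with explicit hyperparameters instead of the defaults.
dt = DecisionTreeClassifier(
    max_depth=3,           # maximum depth of the tree
    min_samples_split=20,  # minimum samples required to split an internal node
    min_samples_leaf=10,   # minimum samples required at a leaf node
    criterion="gini",      # impurity measure used to judge split quality
    random_state=42,
)
dt.fit(X_train, y_train)
evaluate_model(dt, X_train, y_train, X_test, y_test)
```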

In this video, you learnt how changing the default hyperparameters can improve model performance. Now watch the following video to see how entropy can be used instead of Gini to measure the quality of a split.
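Switching the impurity measure is a one-argument change in scikit-learn. A brief sketch, again using the illustrative helper and values from above:

```python
# Same tree, but measuring split quality with entropy (information gain)
# instead of Gini impurity.
dt_entropy = DecisionTreeClassifier(
    max_depth=3,
    criterion="entropy",
    random_state=42,
)
dt_entropy.fit(X_train, y_train)
evaluate_model(dt_entropy, X_train, y_train, X_test, y_test)
```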

So far, you have only been exploring the hyperparameters and observing how they affect model performance. The values you chose just now were more or less arbitrary. However, there should be a systematic way to choose the optimal values for the hyperparameters, right? In the next segment, you will learn how to tune the hyperparameters and find their optimal values using k-fold cross-validation.
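As a preview, one common way to do this in scikit-learn is a grid search with k-fold cross-validation; the grid values below are illustrative, and the next segment covers this in detail:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Candidate hyperparameter values to search over (illustrative grid).
param_grid = {
    "max_depth": [2, 3, 5, 10],
    "min_samples_leaf": [5, 10, 20, 50],
    "criterion": ["gini", "entropy"],
}

grid_search = GridSearchCV(
    estimator=DecisionTreeClassifier(random_state=42),
    param_grid=param_grid,
    cv=5,                # 5-fold cross-validation
    scoring="accuracy",
    n_jobs=-1,
)
grid_search.fit(X_train, y_train)

print("Best hyperparameters:", grid_search.best_params_)
print("Best cross-validated accuracy:", grid_search.best_score_)
```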
