Extreme Gradient Boosting (XGBoost) is a more efficient and advanced implementation of the gradient boosting framework. It was first developed by Tianqi Chen and rose to prominence through the Higgs Boson Machine Learning Challenge on Kaggle. Owing to its robust accuracy, it has since been widely used in machine learning competitions as well.
In the next video, with this understanding, let’s see how XGBoost works.
Let’s summarise what Snehanshu has said by recapping the different tree-based algorithms we have studied so far:
- AdaBoost adds weak learners iteratively to form the final model. Each new model is trained to correct the errors made by the previous ones: the algorithm increases the weights of the data points that were predicted incorrectly, so the ensemble progressively focuses on the cases that are hard to predict correctly.
- Next is Gradient Boosting. You learned about gradient descent in an earlier module, and the same principle applies here: each newly added tree is trained to reduce the errors (the loss function) of the model built so far. So, overall, gradient boosting optimises the performance of the boosted model by bringing the loss down one small step at a time.
- XGBoost is an extended version of gradient boosting: it uses a more accurate (second-order) approximation of the loss function, together with regularisation, to tune the model and find the best fit. The sketch after this list shows all three algorithms side by side.
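To make the recap concrete, here is a minimal sketch (not from the original lesson) that fits all three boosting algorithms on the same synthetic dataset. It assumes scikit-learn and the xgboost package are installed; the dataset and hyperparameters are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Synthetic binary-classification data (illustrative only)
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    # AdaBoost: re-weights hard-to-predict points at each iteration
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=42),
    # Gradient boosting: each new tree reduces the loss of the model so far
    "GradientBoosting": GradientBoostingClassifier(n_estimators=100, random_state=42),
    # XGBoost: regularised gradient boosting with a second-order loss approximation
    "XGBoost": XGBClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {acc:.3f}")
```

The accuracies you get will depend on the data; the point is only that all three share the same fit/predict workflow while differing in how each new weak learner is built.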
Why is XGBoost so good?
- Parallel computing: when you run XGBoost, it uses all the cores of your laptop/machine by default, which lets it parallelise tree construction.
- Regularisation: the biggest advantage of XGBoost is that it applies regularisation, which controls overfitting and keeps the model simple, giving it better performance.
- Built-in cross-validation: XGBoost ships with an internal cross-validation function (`xgb.cv`).
- Missing values: XGBoost handles missing values internally; at each split it learns a default direction for them, so if there is any trend in the missing values, it is captured by the model.
- Flexibility: XGBoost is not limited to regression, classification, and ranking problems; it also supports user-defined objective functions and evaluation metrics. A short sketch demonstrating these points follows this list.
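The following is a minimal sketch (not part of the original text) illustrating several of the points above: regularisation (`lambda`/`alpha` penalties on leaf weights), the built-in cross-validation helper (`xgb.cv`), and native handling of missing values (NaN). Parallel tree construction across all available cores happens by default. All parameter values are illustrative assumptions, not recommended settings; user-defined objectives and metrics are also supported but not shown here.

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

# Synthetic data with some missing entries; XGBoost treats NaN as "missing"
# and learns a default direction for it at each split.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X[::50, 3] = np.nan

params = {
    "objective": "binary:logistic",
    "max_depth": 4,
    "eta": 0.1,
    "lambda": 1.0,  # L2 regularisation on leaf weights
    "alpha": 0.1,   # L1 regularisation on leaf weights
}

dtrain = xgb.DMatrix(X, label=y)  # NaNs are treated as missing by default

# Built-in cross-validation: returns per-round train/test metrics
cv_results = xgb.cv(
    params,
    dtrain,
    num_boost_round=100,
    nfold=5,
    metrics="logloss",
    early_stopping_rounds=10,
    seed=0,
)
print(cv_results.tail())
```

Note that no explicit imputation step is needed before training, and the cross-validation loop runs inside the library rather than in your own code.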
Because of its speed (parallel processing) and model performance, we can say that XGBoost is gradient boosting on steroids.
Now that you have gone through the basic intuition behind XGBoost, let’s study how to implement it practically.