
Hyperparameter Tuning using Randomized Search CV

So, how do you perform all the previously discussed model tuning steps on a larger data set? You saw that, for just 8,000 observations, it took around five minutes to execute all possible combinations of models using GridSearchCV. You cannot afford to keep running this code for days to get the optimum results on larger data sets; you must use more efficient hyperparameter tuning techniques in such cases. So, let’s watch the next video and learn about these techniques.

So, as you learnt in the video, RandomizedSearchCV is a highly efficient technique for identifying the best set of hyperparameter values in fewer iterations. It performs quite well, at a reduced cost and in a shorter time, on huge data sets and on models with a large number of hyperparameters.

RandomizedSearchCV is similar to GridSearchCV, but instead of evaluating every combination, it randomly samples a fixed number of parameter combinations from the full set of possibilities.

It does not perform an exhaustive search over the entire hyperparameter space; instead, it hops across a wide space in far less time than GridSearchCV, moving within the grid in a random fashion to find a good set of hyperparameters, as shown in the sketch below.
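
To make this concrete, here is a minimal sketch of RandomizedSearchCV in scikit-learn. The estimator (a random forest), the parameter ranges and the synthetic data set are illustrative assumptions rather than the course notebook; the key point is that only n_iter sampled combinations are evaluated instead of the full grid.

```python
# A minimal sketch of RandomizedSearchCV; the estimator, parameter ranges
# and data set below are illustrative assumptions, not the course notebook.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic data standing in for the ~8,000-observation data set
X, y = make_classification(n_samples=8000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Distributions (or lists) to sample from, instead of an exhaustive grid
param_distributions = {
    "n_estimators": randint(100, 500),
    "max_depth": randint(3, 15),
    "min_samples_leaf": randint(1, 20),
}

random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=20,            # only 20 sampled combinations, not the full grid
    scoring="accuracy",
    cv=5,
    random_state=42,
    n_jobs=-1,
)
random_search.fit(X_train, y_train)

print("Best hyperparameters:", random_search.best_params_)
print("Best CV score:", random_search.best_score_)
```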

When you have a small data set, the computation time required to test all the hyperparameter combinations is manageable. In this scenario, it is advisable to use a grid search.

However, with large, high-dimensional data sets, training and testing every combination requires a prolonged computation time. In this scenario, it is advisable to use a randomized search, because it samples only a subset of the combinations rather than evaluating all of them exhaustively.

RandomizedSearchCV produced almost the same result as GridSearchCV in a matter of seconds. You can go ahead and perform this fine-tuning yourself with RandomizedSearchCV. In the next video, we will extract the best model and assess its test performance.

So, after model tuning, you could see that the result had improved. The best model changes with the objective function that is defined (for example, RMSE, accuracy or recall) and also with the assumptions made from the business perspective. These choices may lead to different models and different sets of hyperparameter values, each with its corresponding predictive performance.
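
If you want to reproduce the "extract the best model and assess the test performance" step yourself, here is a minimal sketch, assuming the random_search object and the train–test split from the earlier sketch. With the default refit=True, best_estimator_ is the tuned model already refit on the full training data.

```python
# A minimal sketch of extracting the tuned model and checking test performance,
# assuming `random_search`, `X_test` and `y_test` from the earlier sketch.
from sklearn.metrics import accuracy_score, recall_score

best_model = random_search.best_estimator_  # refit on the full training data by default

y_pred = best_model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))
print("Test recall:  ", recall_score(y_test, y_pred))
```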

The cross-validation scheme used in this demonstration of the classification problem is called stratified k-fold, which means the relative class proportions are maintained in each train–test split. However, this is not the only cross-validation scheme; several other schemes are available in scikit-learn, as illustrated in the sketch below. An optional demonstration for hands-on practise of other cross-validation schemes in Python can be accessed here for those who want to explore them in detail.
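
As a small illustration of alternative schemes, the sketch below compares a few cross-validation strategies from scikit-learn. The estimator and data are the illustrative ones assumed in the earlier sketches, not the course notebook.

```python
# A minimal sketch comparing a few cross-validation schemes in scikit-learn;
# `best_model`, `X_train` and `y_train` come from the earlier sketches.
from sklearn.model_selection import KFold, StratifiedKFold, ShuffleSplit, cross_val_score

schemes = {
    "KFold": KFold(n_splits=5, shuffle=True, random_state=42),
    "StratifiedKFold": StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    "ShuffleSplit": ShuffleSplit(n_splits=5, test_size=0.2, random_state=42),
}

for name, cv in schemes.items():
    scores = cross_val_score(best_model, X_train, y_train, cv=cv, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```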
