Model Assessment and Comparison

Once the model is built, you will want to assess its predictive power. In multiple linear regression, you may build more than one model, each with a different combination of the independent variables. In that case, you also need to compare these models with one another to see which one performs best.

Let’s hear more on this from Rahim.

Model assessment brings several new considerations, and choosing the best model is partly subjective. You need to balance keeping the model simple against explaining as much of the variance as possible, which tends to favour keeping more variables. The key idea for striking this balance is that a model can be penalised for each additional predictor variable it keeps.

Hence, two new measures come into the picture: the adjusted R² and the Akaike information criterion (AIC).

$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$

$$\text{AIC} = n \times \log\left(\frac{\text{RSS}}{n}\right) + 2p$$

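Here, n is the sample size (the number of rows in the data set), p is the number of predictor variables, RSS is the residual sum of squares, and log is the natural logarithm. When comparing models fitted to the same data, a higher adjusted R² and a lower AIC are preferred.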
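To see how the two penalties behave, here is a minimal Python sketch of both formulas. The function names, the two candidate models, and all numbers (n, the total sum of squares, and each RSS) are made up for illustration and do not come from the course data set.

```python
import math


def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more rows than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def aic(rss: float, n: int, p: int) -> float:
    """AIC = n * ln(RSS / n) + 2p, using the natural logarithm."""
    return n * math.log(rss / n) + 2 * p


# Hypothetical values: two models fitted to the same 100 rows.
n = 100
tss = 500.0  # total sum of squares of the target (assumed)
candidates = {
    "model_a": {"p": 3, "rss": 120.0},  # fewer predictors
    "model_b": {"p": 8, "rss": 118.0},  # more predictors, slightly lower RSS
}

results = {}
for name, m in candidates.items():
    r2 = 1.0 - m["rss"] / tss  # plain R^2 from RSS and TSS
    results[name] = {
        "r2": r2,
        "adj_r2": adjusted_r2(r2, n, m["p"]),
        "aic": aic(m["rss"], n, m["p"]),
    }
    print(name, results[name])
```

With these assumed numbers, model_b has the slightly higher R², but model_a has the higher adjusted R² and the lower AIC: the five extra predictors do not reduce the RSS enough to justify the penalty.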

Coming up

The adjusted R² adjusts the value of R² such that a model with a larger number of variables is penalised. In the next segment, Rahim will talk about feature selection.

Additional reading

AIC
BIC
Mallows’ Cp
