In this segment, you will analyse all the results generated by the models using the One vs Rest classifier and also view the various intercepts and parameters created or tuned by the model. Let’s hear more about it from Ankit.
In this video, Ankit started off by creating a column called ‘Scaled_features’ containing the scaled values of the test data set. The actual classes of the test data set and the predicted values were stored in the columns ‘Actual’ and ‘prediction_oneVsrest’, respectively. Later, the code was executed as shown below.
X_test['prob_oneVsrest'] = OneVsRes.predict_proba(X_test_scaled).tolist()
So, what are the values stored in the column ‘prob_oneVsrest’? Each test sample, when passed to the One vs Rest classifier, generates three probability scores. Since the data set has three classes (high risk, low risk and medium risk), three logistic regression models are created. Each test sample is passed to each model separately, thus generating a total of three probability scores. So, each row in the ‘prob_oneVsrest’ column will contain the three probability scores, one from each model (that is, one for each class).
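The behaviour described above can be sketched on synthetic data. The names OneVsRes, X_test_scaled, etc. belong to the course notebook; everything below (data, sizes, variable names) is illustrative, not the actual loan data set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Toy stand-in for the loan data: 3 classes, 7 features
X, y = make_classification(n_samples=300, n_features=7, n_informative=5,
                           n_classes=3, random_state=42)

# One vs Rest fits one logistic regression model per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

probs = ovr.predict_proba(X[:5])
print(probs.shape)  # one row per sample, one probability score per class
```

Note that `predict_proba` returns one column per class, so `.tolist()` on it (as in the course code) gives a list of three scores for every test sample.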
Next, Ankit executed the code given below, creating three columns ‘prob_oneVsrest_highRisk’, ‘prob_oneVsrest_lowRisk’ and ‘prob_oneVsrest_mediumRisk’.
X_test['prob_oneVsrest_highRisk'] = OneVsRes.predict_proba(X_test_scaled)[:,0].tolist()
X_test['prob_oneVsrest_lowRisk'] = OneVsRes.predict_proba(X_test_scaled)[:,1].tolist()
X_test['prob_oneVsrest_mediumRisk'] = OneVsRes.predict_proba(X_test_scaled)[:,2].tolist()
So, what are the values stored in the respective columns? As we already discussed, for each test sample, there will be three probability scores, corresponding to ‘high risk’, ‘low risk’ and ‘medium risk’. Each probability is stored in the corresponding column. For example, for a test sample ‘A’, the probability of A belonging to the ‘high risk’ category will be stored in the ‘prob_oneVsrest_highRisk’ column. Similarly, the probabilities of A belonging to the ‘medium risk’ and ‘low risk’ categories will be stored in the ‘prob_oneVsrest_mediumRisk’ and ‘prob_oneVsrest_lowRisk’ columns, respectively. Then, we execute X_test.head() to analyse the values. In the image given below, you can see that each test sample is classified under the class with the highest of the three probability values.
In the first record, the sample is classified as ‘low risk’, and you can see that the highest probability value is in the ‘prob_oneVsrest_lowRisk’ column, with a value of 0.864656. Similarly, you can check the other values as well. In the next video, we will analyse the various parameters and values created for each logistic regression model.
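The claim that the predicted class is always the one with the highest probability score can be verified on synthetic data (a sketch with illustrative names and data, not the course data set):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 3-class, 7-feature data standing in for the scaled loan features
X, y = make_classification(n_samples=300, n_features=7, n_informative=5,
                           n_classes=3, random_state=0)
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

probs = ovr.predict_proba(X)
preds = ovr.predict(X)

# For every row, the predicted label is the class whose column holds
# the highest of the three probability scores
assert (ovr.classes_[np.argmax(probs, axis=1)] == preds).all()
print("predicted class == argmax of the three probabilities, for every row")
```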
In the video, Ankit started off by executing the code provided below:
# Classes for which individual models are created
print(OneVsRes.classes_)
# Coefficient matrix for all the models created
print(OneVsRes.coef_.shape)
What do the results look like, and what do they represent?
The statement print(OneVsRes.classes_) prints all the class labels (target categories) in the data set. The output will be as follows:
- [‘High Risk’ ‘Low Risk’ ‘Medium Risk’]

The output of the next statement, print(OneVsRes.coef_.shape), is (3, 7). What does this mean? In logistic regression, you learnt about the parameters β0 and β1. By varying the values of β0 and β1, you will get different Sigmoid curves, and based on some function that you have to minimise or maximise, you will get the best-fit Sigmoid curve. As you know, the formula for calculating the probability is as follows:

P = 1 / (1 + e^(−(β0 + β1x)))

Here, β0 is called the intercept and β1 is called the weight coefficient. You will learn about these terms in detail in later modules, but for now, you can just remember them. The parameter that is multiplied by the input value ‘x’ is called the weight coefficient; in the equation given above, that is β1. Note that for a particular data set, a model will create one weight coefficient for each column that is used for training. For example, in the diabetes data set that you considered in the logistic regression module, you had only one input feature (blood sugar level), so there was only one weight coefficient, β1. In the loan data set, on the other hand, there are seven feature columns, which are as follows:
- loan_amnt
- int_rate
- installment
- emp_length
- annual_inc
- fund_perc
- incToloan_perc
So, we will have a total of seven weight coefficients. The 7 in the output (3, 7) represents the number of weight coefficients created for each individual LR model. Similarly, each model created will have only one β0 value, irrespective of the number of columns. In the diabetes example, we created only one LR model, so we had only one intercept value (β0). In the loan data set, on the other hand, the One vs Rest classifier creates three models, so each model will have its own β0 value.
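Going back to the probability formula P = 1 / (1 + e^(−(β0 + β1x))), here is a quick numeric check with a single feature. The parameter values are purely illustrative, not taken from the course model:

```python
import math

def sigmoid_probability(beta0, beta1, x):
    """P = 1 / (1 + e^-(beta0 + beta1*x)) for one input feature x."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

# Illustrative values: at beta0 + beta1*x = 0 the curve sits at its midpoint
print(sigmoid_probability(0.0, 1.0, 0.0))   # 0.5
# A large positive beta0 + beta1*x pushes the probability towards 1
print(sigmoid_probability(-1.0, 2.0, 3.0))
```

With seven features, the term β1x simply becomes a sum of seven weight-times-feature products, which is why each model carries seven weight coefficients and one intercept.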
The 3 in the output (3, 7) represents the number of LR models created, which is three. To summarise:
- LR Model 1: Seven different β1 values and one β0 value
- LR Model 2: Seven different β1 values and one β0 value
- LR Model 3: Seven different β1 values and one β0 value
The code ‘print(OneVsRes.intercept_)’ shows the three intercept values created for each individual LR model. We are not going to focus on the numerical values, as the intention is to show that we will have well-trained parameters and the best-fit Sigmoid curve for each LR model.
The intercept values are shown below:
- Model 1: -1.97140289
- Model 2: 1.80477709
- Model 3: -5.06997072
Similarly, the code ‘print(OneVsRes.coef_)’ shows the seven weight coefficient values created for each individual LR model. Here also, we are not going to focus on the numerical values, as the intention is to show that we will have well-trained parameters and the best-fit Sigmoid curve for each LR model. The weight coefficients are shown below:
- Model 1: 0.43258834 0.56140827 -0.41564408 0.04909449 -0.3767037 0.0423191 -0.00791024
- Model 2: -1.162999 -0.61923122 1.13533291 -0.0702036 0.2797535 -0.2110285 0.08366904
- Model 3: 4.80649703 0.98055294 -5.47052603 0.15622182 0.17946747 1.28370612 -0.62392508
Later in the video, Ankit took a test sample and displayed its probability scores. The code given below was executed:
print(X_test.iloc[0]['prob_oneVsrest'])
Here, we got the following three probability scores for the test sample:
- 0.1352521804233434
- 0.8646560093263703
- 9.181025028619104e-05
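The list-valued column access shown above can be mimicked on synthetic data; the column name matches the course notebook, but the data and feature names below are illustrative:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in for the scaled loan test set
X, y = make_classification(n_samples=100, n_features=7, n_informative=5,
                           n_classes=3, random_state=2)
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

X_test = pd.DataFrame(X, columns=[f"f{i}" for i in range(7)])
# .tolist() stores a plain Python list of three scores in each row
X_test['prob_oneVsrest'] = ovr.predict_proba(X).tolist()

row0 = X_test.iloc[0]['prob_oneVsrest']
print(type(row0).__name__, len(row0))  # a list of three probability scores
```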
So, how do we arrive at these probability scores? In the next segment, we will discuss the mathematical calculations executed internally by the LR models.