In the forthcoming video, Anjali will explain how to identify and handle non-linear data in Python. Such data can be modelled effectively using polynomial regression.
So, in the video, you learnt that almost all the assumptions of linear regression were violated in the case of non-linear data. You also observed a bimodal distribution of the residuals, which violates the assumption that the residuals are normally distributed. Here, we present a very simple way to directly extend the linear model to accommodate non-linear relationships: polynomial regression. Let’s watch the next video to understand this better.
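One quick way to see this violation for yourself is to fit a straight line to curved data and inspect the residuals. The sketch below uses synthetic data (not the dataset from the video): because a straight line cannot capture the curvature, the residuals retain the x² pattern instead of looking like random noise.

```python
import numpy as np

# Synthetic, deliberately non-linear data (assumed for illustration;
# this is not the dataset used in the video)
x = np.linspace(-3, 3, 100)
y = x**2

# Fit a straight line and compute its residuals
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# The residuals still carry the curvature the line missed, so they
# correlate almost perfectly with x^2 instead of being patternless.
corr = np.corrcoef(residuals, x**2)[0, 1]
print(corr)
```

A residual plot of such a fit would show a clear U-shape, which is the visual cue that a linear model is inadequate.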
The kth-order polynomial model in one variable is given by:
y = β0 + β1x + β2x² + β3x³ + … + βkxᵏ + ε
If we set xj = xʲ for j = 1, 2, …, k, then the model becomes a multiple linear regression model with k predictor variables, x1, x2, x3, …, xk. So, the linear regression model y = Xβ + ε can include polynomial predictors. Thus, polynomial regression can be considered an extension of multiple linear regression, and, hence, we can use the same technique used in multiple linear regression to estimate the model coefficients for polynomial regression.
In the upcoming video, Anjali will explain how to build a model using Polynomial Regression in Python.
So, in the above example, we squared the independent variable and fitted a linear regression model on the transformed feature. As a result, the model gave us a noticeably better fit.
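The improvement from adding the squared term can be quantified by comparing the sum of squared errors of the two fits. The sketch below uses synthetic curved data (the dataset from the video is not reproduced here); the fit that includes x² leaves a far smaller residual error than the straight-line fit.

```python
import numpy as np

# Synthetic curved data (assumed for illustration)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.0 + 0.8 * x + 0.3 * x**2 + rng.normal(scale=0.5, size=x.size)

# Straight-line fit vs. a fit that also includes the squared term;
# full=True makes polyfit return the sum of squared residuals as well.
_, sse_linear = np.polyfit(x, y, deg=1, full=True)[:2]
_, sse_quadratic = np.polyfit(x, y, deg=2, full=True)[:2]

print(sse_linear[0], sse_quadratic[0])
```

In practice, you would weigh this error reduction against the risk of overfitting before adding still higher-order terms.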
In the next segment, we will look at another method to handle the non-linearity present in the data.