Before getting into the implementation of linear regression, let's quickly recap the basics of the linear regression algorithm. Go through the following video, where Ajay takes you through the concepts involved in the linear regression algorithm.
As explained by Ajay, machine learning algorithms are classified into:
- Supervised Learning: Linear regression, Logistic Regression
- Unsupervised Learning: Clustering
As explained, the linear regression model attempts to explain the relationship between a dependent variable and an independent variable using a straight line. For example, consider predicting a company's sales based on its marketing budget, where:
- Sales is the dependent variable.
- Marketing budget is the independent variable.
The first step is to visualise the historical data points by drawing a scatter plot. Next, you want to fit a straight line through these data points, one that learns the behaviour of the data and predicts the actual sales for a given marketing budget. A simple straight line can be fitted as follows:
Now the equation of any straight line can be written as:
Y = β₀ + β₁X
Where,
β₀ = Intercept
β₁ = Slope
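The line equation above can be sketched in a few lines of Python. The intercept and slope values below are made-up illustrative numbers, not fitted coefficients:

```python
# Minimal sketch of the line Y = beta0 + beta1 * X.
# beta0 and beta1 are assumed, illustrative values.
beta0 = 50.0   # intercept: predicted sales when the budget is zero
beta1 = 2.5    # slope: change in sales per unit increase in budget

def predict(x):
    """Predicted sales for a given marketing budget x."""
    return beta0 + beta1 * x

print(predict(10))  # -> 75.0
```

For a budget of 10 units, the line predicts sales of 50 + 2.5 × 10 = 75 units.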
Not all the points in a dataset will lie on the regression line. The difference between the actual value and the predicted value is called the residual. In the next video, Ajay talks about residuals and finding the best-fit line.
As explained above, residuals can be viewed as the errors between the actual and predicted values. The residual (error) for the i-th observation is written as:
eᵢ = yᵢ − ŷᵢ
where ŷᵢ is the value predicted by the line.
Now the sum of squared errors, known as the Residual Sum of Squares (RSS), is written as:
RSS = e₁² + e₂² + … + eₙ²
RSS = (Y₁ − β₀ − β₁X₁)² + (Y₂ − β₀ − β₁X₂)² + … + (Yₙ − β₀ − β₁Xₙ)²
So, RSS = Σᵢ₌₁ⁿ (Yᵢ − β₀ − β₁Xᵢ)²
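The RSS formula translates directly into code. The data points and candidate coefficients below are hypothetical, chosen only to illustrate the computation:

```python
# Hypothetical sketch: computing RSS for a small made-up dataset.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
beta0, beta1 = 1.0, 2.0  # assumed candidate coefficients, not fitted

# RSS = sum over i of (y_i - beta0 - beta1 * x_i)^2
rss = sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys))
print(rss)  # small value, since the data lies close to y = 1 + 2x
```

A smaller RSS means the candidate line sits closer to the data; the best-fit line is the one whose β₀ and β₁ make this quantity as small as possible.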
In order to get the best-fit line, the RSS should be minimised to obtain the optimal values of β₀ and β₁. The RSS can be minimised using the following methods:
- Gradient descent
- Differentiation
Gradient Descent is the most common approach used in the industry.
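A bare-bones gradient descent for this problem can be sketched as follows. The learning rate, iteration count, and toy dataset are illustrative assumptions; real implementations use vectorised libraries and convergence checks:

```python
# Minimal gradient-descent sketch for minimising RSS.
# Data lies exactly on y = 1 + 2x, so the optimum is b0 = 1, b1 = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

b0, b1 = 0.0, 0.0   # start from an arbitrary guess
lr = 0.01           # assumed learning rate

for _ in range(20000):
    # Partial derivatives of RSS with respect to b0 and b1
    grad0 = sum(-2 * (y - b0 - b1 * x) for x, y in zip(xs, ys))
    grad1 = sum(-2 * x * (y - b0 - b1 * x) for x, y in zip(xs, ys))
    # Step downhill, against the gradient
    b0 -= lr * grad0
    b1 -= lr * grad1

print(round(b0, 3), round(b1, 3))  # converges near 1.0 and 2.0
```

Each iteration nudges β₀ and β₁ in the direction that decreases RSS, which is why the method is called "descent".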
As explained by Sajan, the drawback of RSS is that it is an absolute number: its value depends on the scale of the data. In order to mitigate this drawback, the strength of a linear regression model is mainly explained using R²:
R² = 1 − (RSS/TSS)
Where,
RSS = Residual Sum of Squares.
TSS=Total Sum of Squares.
TSS is a measure of how much a data set varies around a central value (for example, the mean).
As discussed by Sajan in the video, the higher the value of R², the better the model.
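Computing R² follows directly from the formula above. The actual values and predictions below are made-up numbers for illustration:

```python
# Hedged sketch: R^2 = 1 - RSS/TSS for a made-up dataset.
ys = [3.0, 5.0, 7.0, 9.0]        # actual values
preds = [3.2, 4.8, 7.1, 8.9]     # hypothetical model predictions

mean_y = sum(ys) / len(ys)
rss = sum((y - p) ** 2 for y, p in zip(ys, preds))       # error around the model
tss = sum((y - mean_y) ** 2 for y in ys)                 # variation around the mean
r2 = 1 - rss / tss
print(round(r2, 4))  # close to 1, since the predictions track the data well
```

Because RSS is divided by TSS, R² is scale-free: it compares the model's error against the variation a naive mean-only prediction would leave unexplained.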
In the subsequent segments, you will learn about model-building techniques and then explore linear regression in a little more detail.