#### Why matrices?

You might be wondering whether there are any other representations of linear regression that can further simplify the solution. In the next video, Anjali will introduce you to the matrix representation of linear regression and explain how it can be used to compute the model coefficients.

#### Note

At 02:30, the dimensions are incorrectly written as (n,1) for the model coefficient vector. It should be (2,1).

So, in the video, you saw how the simple linear regression equations are converted to their matrix equivalent. For ‘n’ observations, each observation can be represented with the following set of equations:

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, 2, \dots, n$$

The equations above have been converted to their matrix/vector equivalent as shown below:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}$$

The X matrix, i.e., the matrix with the predictors, is also known as the **design matrix**. Here, the first column of the design matrix is a column of 1s for the intercept term β0, and the second column contains the predictor values. There are n rows in the matrix for ‘n’ observations. The error vector consists of the residuals for each observation; it is added to the product of the design matrix and the parameter vector to obtain the response vector, Y.
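As a quick illustration (a minimal sketch of our own; the variable names are not from the video), the design matrix described above can be assembled in NumPy by placing a column of 1s next to the predictor values:

```python
import numpy as np

# Predictor values for n = 4 hypothetical observations
x = np.array([1.0, 2.0, 3.0, 4.0])

# Design matrix: a column of 1s (for the intercept) next to the predictor column
X = np.column_stack([np.ones_like(x), x])

print(X.shape)  # (4, 2): n rows, one column per coefficient
```

Note that the parameter vector multiplying this matrix has the dimensions (2, 1), as the correction at 02:30 in the video points out.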

You may want to refresh your memory of matrices by visiting the following link.

This matrix equation can be written in a very concise notation as:

$$Y = X\beta + \epsilon$$

Before delving further into the matrix representation, let’s try and understand some of the benefits of using matrices:

- Formulae become simpler, more compact and more readable.

- Code using matrices runs much faster than explicit ‘for’ loops.

- Python libraries, such as NumPy, let us build n-dimensional arrays, which occupy less memory than Python lists and also support faster computation.
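To make the last two points concrete, here is a small sketch (our own illustration, with hypothetical coefficient values) comparing an explicit ‘for’ loop with the equivalent vectorized matrix product:

```python
import numpy as np

X = np.column_stack([np.ones(5), np.arange(5.0)])  # 5x2 design matrix
beta = np.array([2.0, 3.0])                        # hypothetical coefficients

# Explicit loops: one dot product per observation, computed in Python
y_loop = [sum(X[i, j] * beta[j] for j in range(2)) for i in range(5)]

# Vectorized: one matrix-vector product, computed in optimized compiled code
y_vec = X @ beta

print(np.allclose(y_loop, y_vec))  # True
```

Both give identical results; on realistically sized data, the vectorized version is dramatically faster because the loops run inside NumPy rather than in the Python interpreter.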

#### Best parameter values through normal equations using matrices

In simple linear regression, we obtained the values of b0 and b1 by solving the normal equations using basic algebra. Let us see how we can use matrices to derive the solution for the β coefficients.

As we saw in the video, to get the RSS, we multiply the transpose of the error vector by the error vector itself, as shown below:

$$RSS = \epsilon^T \epsilon = (Y - X\beta)^T (Y - X\beta)$$

Then, we differentiate the RSS w.r.t. β and equate the derivative to 0, which yields:

$$\beta = (X^T X)^{-1} X^T Y$$

Please note that we have skipped the differentiation step in the above derivation; it is not required here. The main objective is to understand that with simple matrix operations, you can find the values of the coefficients.

So, this is how we get the beta coefficients that minimise the RSS. The same expression can be used irrespective of the number of predictors or variables in the model. For example, for simple linear regression, X is simply the n × 2 design matrix described earlier, and the same formula yields the coefficients b0 and b1.

The same equation can be used to find the multiple coefficients present in multiple linear regression. Now, in the forthcoming video, we will use this formula in our Python code and check whether or not we get the same solution.
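As a sketch of what such a check might look like (with made-up illustrative data, not the marketing data set used in the video), the matrix solution can be compared with the familiar closed-form slope and intercept from basic algebra:

```python
import numpy as np

# Hypothetical data (not the marketing data set from the video)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# Matrix solution: beta = (X'X)^{-1} X'Y
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.inv(X.T @ X) @ X.T @ y

# Closed-form normal-equation solution from basic algebra
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(np.allclose(beta, [b0, b1]))  # True
```

Writing out the inverse mirrors the formula directly; in practice, `np.linalg.solve` or `np.linalg.lstsq` is preferred over an explicit matrix inverse for numerical stability.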

So, in the video, Anjali built a simple linear regression model on the marketing data set and verified that the results obtained from the normal equations and from the matrix calculations match. So far, you have built and verified a simple linear regression model. Now, as you are already aware, the next step is to build multiple linear regression models. In the next segment, you will learn about the multiple linear regression model.