Here’s a brief summary of what you learnt in this session:

1. Machine learning models can be classified into the following two categories on the basis of the learning algorithm:

**Supervised learning method:** Past data with labels is available to build the model.

**Regression:** The output variable is continuous in nature.

**Classification:** The output variable is categorical in nature.

**Unsupervised learning method:** Past data with labels is not available.

**Clustering:** There is no predefined notion of labels.

2. Past dataset is divided into two parts in the supervised learning method:

**Training data** is used for the model to learn during modelling.

**Testing data** is used by the trained model for prediction and model evaluation.
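As a rough illustration of this split, here is a minimal sketch in plain Python. The helper name `train_test_split`, the 70/30 ratio, the seed and the toy `data` list are all assumptions made for this example; they are not prescribed by the session.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle the rows and split them into (train, test) lists."""
    shuffled = list(rows)                  # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatability
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labelled data: (x, y) pairs where y is the known label
data = [(x, 2 * x + 1) for x in range(10)]
train, test = train_test_split(data)
print(len(train), len(test))  # prints: 7 3
```

The model would be fitted on `train` only, and `test` would be held back to check how well the fitted model predicts labels it has not seen.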

3. Linear regression models can be classified into two types depending upon the number of independent variables:

**Simple linear regression:** This is used when the number of independent variables is 1.

**Multiple linear regression:** This is used when the number of independent variables is more than 1.

4. The equation of the best fit regression line Y = β₀ + β₁X can be found by minimising the cost function (RSS in this case, using the ordinary least squares method), which is done using the following two methods:

**Differentiation**

**Gradient descent**
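The two methods above can be sketched for simple linear regression as follows. This is an illustrative sketch, not the course's own implementation: the function names, learning rate, iteration count and sample data are assumptions, and the gradient descent version minimises RSS divided by n, which has the same minimiser as RSS.

```python
def fit_closed_form(xs, ys):
    """Differentiation: set the partial derivatives of RSS to zero and solve."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx               # slope, β₁
    b0 = y_mean - b1 * x_mean    # intercept, β₀
    return b0, b1

def fit_gradient_descent(xs, ys, learning_rate=0.01, n_iters=20000):
    """Gradient descent: step repeatedly against the gradient of RSS / n."""
    n = len(xs)
    b0, b1 = 0.0, 0.0
    for _ in range(n_iters):
        residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        grad_b0 = -2 * sum(residuals) / n
        grad_b1 = -2 * sum(r * x for r, x in zip(residuals, xs)) / n
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1

# Hypothetical sample data, roughly following y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
print(fit_closed_form(xs, ys))
print(fit_gradient_descent(xs, ys))
```

Both functions should arrive at nearly the same β₀ and β₁ on this data. The closed-form route is exact, while gradient descent depends on a suitable learning rate and enough iterations.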

5. The strength of a linear regression model is mainly explained by R², where R² = 1 − (RSS/TSS).

**RSS:** Residual sum of squares.

**TSS:** Total sum of squares.
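To make the R² formula concrete, here is a small sketch that computes RSS, TSS and R² from actual and predicted values. The function name `r_squared` and the sample numbers are assumptions chosen for illustration.

```python
def r_squared(ys, predictions):
    """Return R² = 1 - RSS/TSS for actual values ys and model predictions."""
    y_mean = sum(ys) / len(ys)
    rss = sum((y - p) ** 2 for y, p in zip(ys, predictions))  # residual sum of squares
    tss = sum((y - y_mean) ** 2 for y in ys)                  # total sum of squares
    return 1 - rss / tss

# Hypothetical actual values and predictions from some fitted line
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(r_squared(actual, predicted))
```

Here RSS is 0.5 and TSS is 5, so R² comes out at about 0.9, meaning the line accounts for roughly 90% of the variance in the actual values.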

6. RSE (residual standard error) helps in measuring the lack of fit of a model on a given dataset. The closeness of the estimated regression coefficients to the true ones can be estimated using RSE. It is related to RSS by the formula RSE = √(RSS/df), where df = n − 2 and n is the number of data points.
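A short sketch of the RSE calculation for simple linear regression is given below. The function name, the `n_params` argument (set to 2 for β₀ and β₁, which gives df = n − 2) and the sample values are assumptions made for this example.

```python
import math

def residual_standard_error(ys, predictions, n_params=2):
    """Return RSE = sqrt(RSS / df), with df = n - n_params."""
    rss = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    df = len(ys) - n_params  # n - 2 for simple linear regression
    return math.sqrt(rss / df)

# Hypothetical actual values and predictions from a fitted line
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(residual_standard_error(actual, predicted))
```

With these numbers RSS is 0.5 and df is 2, so RSE is √0.25 = 0.5, expressed in the same units as the output variable.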

**Additional Reading:**