In the previous segment, you saw that by trying different values of β0 and β1, you can manipulate the shape of the sigmoid curve. At some combination of β0 and β1, the ‘likelihood’ (length of yellow bars) will be maximised.

**Logistic Regression – Optimisation Methods (Optional)**

The question is: how do you find the optimal values of β0 and β1 such that the likelihood function is maximised? Finding them is the problem of maximum likelihood estimation (MLE), and the optimisation methods used to solve it are an optional part of the course. You can study the optimisation techniques in detail in a separate optional session here. The following concepts are covered in the optional session:

- Introduction to maximum likelihood estimation (MLE)
- MLE for continuous (normal) and discrete probability distributions (Bernoulli distribution and logistic regression)
- Optimising MLE cost functions using gradient descent
- Alternate optimisation methods: Newton-Raphson method

You may study the optional session either along with this module or sometime later; this will not interrupt your flow of study for this module (there is no deadline for the optional session). There are no prerequisites for the optional session apart from the previous module.
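To give a rough feel for what "maximising the likelihood" involves, here is a minimal sketch that fits β0 and β1 with Newton-Raphson steps in plain Python. The data (`sugar`, `diabetic`) is a small hypothetical toy set, not the course dataset, and the function names are illustrative only; the optional session covers the actual methods in detail.

```python
import math

# Hypothetical toy data (NOT the course dataset):
# blood sugar level and whether the patient is diabetic (1) or not (0).
sugar = [150, 165, 180, 190, 200, 210, 215, 225, 240, 260]
diabetic = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    """Log of the likelihood of the labels under the sigmoid model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_newton(xs, ys, iterations=25):
    """Maximise the log-likelihood with Newton-Raphson updates."""
    # Centre x so the 2x2 system is well conditioned.
    mean_x = sum(xs) / len(xs)
    centred = [x - mean_x for x in xs]
    a, b = 0.0, 0.0  # intercept (on the centred scale) and slope
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for c, y in zip(centred, ys):
            p = sigmoid(a + b * c)
            w = p * (1 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * c      # gradient w.r.t. slope
            h00 += w               # entries of the (negated) Hessian
            h01 += w * c
            h11 += w * c * c
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negated Hessian) * delta = gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    # Convert the intercept back to the original (uncentred) x scale.
    return a - b * mean_x, b


b0, b1 = fit_newton(sugar, diabetic)
print(f"beta0 = {b0:.3f}, beta1 = {b1:.4f}")
print(f"log-likelihood at the fit: {log_likelihood(b0, b1, sugar, diabetic):.3f}")
```

At the fitted values the gradient of the log-likelihood is essentially zero, which is the condition the optimiser is searching for.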

**Logistic Regression in Python**

Let’s now look at how logistic regression is implemented in Python.

In Python, logistic regression can be implemented using libraries such as scikit-learn and statsmodels, though inspecting the coefficients and the model summary is easier with statsmodels.

You can find the optimum values of β0 and β1 using the Python code given below. Please download and run the code and observe the values of the coefficients. You may also want to visualise them in the interactive app on the previous page; note that β0 can only take integer values in the app, so use -13 or -14 as an approximation.

Please note that you will study detailed Python code for logistic regression in the next module. The code here is run only to find the optimum values of β0 and β1, so that we can first proceed with the very important concepts of **Odds** and **Log Odds**.

The summary of the model is given below:

[Table: model summary output]

In the summary shown above, ‘const’ corresponds to β0, and ‘x1’ (the blood sugar level) corresponds to β1. So, β0 is approximately -13.5 and β1 is approximately 0.06.
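As a quick illustration of how these two coefficients are used, the sketch below plugs them into the sigmoid function. It uses the rounded values quoted here, so the outputs are approximate and will differ somewhat from figures computed with the unrounded estimates; the function name `predict_probability` is illustrative only.

```python
import math

# Rounded coefficients read from the model summary; treat results as approximate.
BETA0 = -13.5
BETA1 = 0.06


def predict_probability(sugar_level):
    """Sigmoid of the linear term beta0 + beta1 * x."""
    z = BETA0 + BETA1 * sugar_level
    return 1.0 / (1.0 + math.exp(-z))


# Blood sugar level at which the linear term is zero, i.e. P = 0.5.
midpoint = -BETA0 / BETA1

for x in (175, 200, 225, 250, 275):
    print(x, round(predict_probability(x), 3))
```

With these rounded values, the curve crosses a probability of 0.5 at a blood sugar level of -β0/β1 = 225.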

**Odds and Log Odds**

So far, you’ve seen this equation for logistic regression:

$$P = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

Recall that this equation gives the relationship between P, the probability of diabetes, and x, the patient’s blood sugar level.

While the equation is correct, it is not very intuitive. In other words, the relationship between P and x is so complex that it is difficult to understand what kind of trend exists between the two. If you increase x by regular intervals of, say, 11.5, how will that affect the probability? Will it also increase by some regular interval? If not, what will happen?

So, clearly, the relationship between P and x is too complex to see any apparent trends. However, if you convert the equation to a slightly different form, you can achieve a much more intuitive relationship. In the next video, let’s hear from Prof. Dinesh on how that can be done.

*[Note: By default, for this course, if the base of the logarithm is not specified, take it as e. So, log(x) = logₑ(x).]*

So, now, instead of probability, you have **odds** and **log odds**. Clearly, the relationship between them and x is much more **intuitive** and easy to understand.
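For reference, the rearrangement that leads from the sigmoid equation to the odds and log odds can be written out as follows. This is the standard derivation, using the same symbols as the equation for P:

$$
\begin{aligned}
P &= \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \\
1 - P &= \frac{e^{-(\beta_0 + \beta_1 x)}}{1 + e^{-(\beta_0 + \beta_1 x)}} \\
\text{odds} = \frac{P}{1 - P} &= e^{\beta_0 + \beta_1 x} \\
\text{log odds} = \ln\left(\frac{P}{1 - P}\right) &= \beta_0 + \beta_1 x
\end{aligned}
$$

The log odds are therefore a linear function of x, and the odds are an exponential function of x.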

For example, if you increase x by regular intervals of, say, 11.5, how will that affect the log odds?
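A small sketch of this calculation is given below. It assumes the rounded slope β1 = 0.06 from the model summary, so the result is approximate.

```python
import math

BETA1 = 0.06  # rounded slope from the model summary (approximate)
STEP = 11.5   # increase in blood sugar level

# Each step in x adds a fixed amount to the log odds ...
log_odds_change = BETA1 * STEP
# ... which multiplies the odds by a fixed factor.
odds_multiplier = math.exp(log_odds_change)

print(f"log odds increase by {log_odds_change:.2f} per step")
print(f"odds are multiplied by {odds_multiplier:.2f} per step")
```

Since 0.06 × 11.5 = 0.69, which is very close to ln 2, every increase of 11.5 in x roughly doubles the odds, whatever the starting value of x.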

Please note that in the video above, at **3:05**, the value shown as 2.94 should have been **2.96**.

So, the relationship between x and probability is not intuitive, while that between x and **odds/log odds** is. This has important implications. Suppose you are discussing sugar levels and the probability they correspond to. While talking about 4 patients with sugar levels of 180, 200, 220 and 240, you will not be able to intuitively understand the relationship between their probabilities (10%, 28%, 58%, 83%). However, if you are talking about the log odds of these 4 patients, you know that their log odds are in a **linearly increasing pattern** (-2.18, -0.92, 0.34, 1.60) and that the odds are in a **multiplicatively increasing pattern** (0.11, 0.40, 1.40, 4.95, increasing by a factor of 3.55).
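The figures for these four patients can be checked with a short sketch. It starts from the log odds quoted in the text (which are themselves rounded) and converts them to odds and probabilities; the variable names are illustrative only.

```python
import math

sugar_levels = [180, 200, 220, 240]
log_odds = [-2.18, -0.92, 0.34, 1.60]  # values quoted in the text (rounded)

# odds = e^(log odds); probability = odds / (1 + odds)
odds = [math.exp(v) for v in log_odds]
probabilities = [o / (1 + o) for o in odds]

# Constant difference in log odds <=> constant ratio in odds.
log_odds_steps = [b - a for a, b in zip(log_odds, log_odds[1:])]
odds_ratios = [b / a for a, b in zip(odds, odds[1:])]

for x, lo, o, p in zip(sugar_levels, log_odds, odds, probabilities):
    print(f"x={x}: log odds={lo:+.2f}, odds={o:.2f}, probability={p:.0%}")
print("log odds steps:", [round(s, 2) for s in log_odds_steps])
print("odds ratios:", [round(r, 2) for r in odds_ratios])
```

Each step of 20 in x adds 1.26 to the log odds and multiplies the odds by e^1.26, about 3.53 when computed from these rounded log odds. That is close to the factor of 3.55 quoted above, which presumably comes from unrounded values.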

Hence, it often makes more sense to present a logistic regression model’s results in terms of log odds or odds than in terms of probability. This is especially common in industries such as finance and banking.

That’s the end of this session on univariate logistic regression. You studied logistic regression and, specifically, the sigmoid function, which has this equation:

$$P = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

However, this is not the only form of equation for logistic regression. If you wish to learn about what the other forms are, you can go through them here. For this course, you do not need to know about the other forms, as we will not be discussing them anywhere.