Regression vs Time Series

In the earlier modules, you learned about regression, and you may now be wondering how time series forecasting differs from simple regression. Regression alone is often not enough to produce an accurate forecast. In this segment, you will learn how time series forecasting differs from simple regression.

A time series is a sequence of time-stamped values: each value has a time attached to it. The order of the values is very important here, which is not the case with a normal regression model.

Using time series analysis, you can forecast:

  • the value of the stock market index for a future month
  • the value of the literacy rate for a future census
  • the population of a nation in the coming year

We will look at exactly how this is done later in the module. Now, you may think that you can simply use regression or advanced regression to make a forecast, taking time as an independent variable. However, this does not work well, for several reasons. One reason is that in a time series, the sequence is important. For example, take the data below.

Time Stamp    Value
1             2.4
2             3.1
3             5
4             4.5
5             7.2
6             6.8

Suppose you use regression or advanced regression to predict the value for timestamp 7. Now suppose you shuffle the values in the Value column, like this:

Time Stamp    Value
1             2.4
2             4.5
3             7.2
4             3.1
5             5
6             6.8

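A regression model treats each row as an independent observation, so it has no notion of one value following another: a model that ignores the Time Stamp column sees the same six values in both tables and gives the same prediction for timestamp 7. (If the time stamp itself is the independent variable, moving values to different time stamps changes the (time stamp, value) pairs, so the fitted line changes as well; only reordering whole rows leaves it unchanged.) A time series analysis, however, will give you different forecasts for the original data and for the shuffled one.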

Why does this happen? When forecasting with a time series model, the prediction depends not only on the values given but also on the sequence in which they occur. Hence, the sequence is very important in a time series analysis and must be preserved.
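The effect of ordering can be checked with a small sketch. The Python below is illustrative and not part of the course material: it fits an ordinary least squares line to the rows of the first table, refits after the rows are rearranged (each time stamp staying with its value), and compares that with a one-step forecast from simple exponential smoothing, used here as a stand-in for a time series model. The function names and the smoothing factor alpha = 0.5 are assumptions chosen for the example.

```python
# Rows of the first table in the text: (time stamp, value).
rows = [(1, 2.4), (2, 3.1), (3, 5.0), (4, 4.5), (5, 7.2), (6, 6.8)]

# The same rows, listed in the order in which the values appear in the
# shuffled table. Each value keeps its own time stamp here.
reordered_rows = [(1, 2.4), (4, 4.5), (5, 7.2), (2, 3.1), (3, 5.0), (6, 6.8)]


def regression_predict(pairs, x_new):
    """Fit an ordinary least squares line to (x, y) pairs; evaluate at x_new."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return mean_y + slope * (x_new - mean_x)


def smoothing_forecast(values, alpha=0.5):
    """One-step-ahead forecast from simple exponential smoothing.

    The values are consumed in the order given, so later values
    carry more weight than earlier ones.
    """
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


# Regression: the row order is irrelevant, only the pairs matter.
reg_a = regression_predict(rows, 7)
reg_b = regression_predict(reordered_rows, 7)

# Smoothing: the same six values in a different order.
ts_a = smoothing_forecast([v for _, v in rows])
ts_b = smoothing_forecast([v for _, v in reordered_rows])

print("regression:", round(reg_a, 3), round(reg_b, 3))
print("smoothing: ", round(ts_a, 3), round(ts_b, 3))
```

In this sketch the regression prediction is the same for both row orders, while the smoothing forecast differs, because smoothing weights the most recent values more heavily than the earlier ones.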

So, the two most important differences between time series and regression are:

  • Time series have a strong temporal (time-based) dependence: a time series is a set of time-stamped observations, each tied to a specific time instance. Thus, unlike in regression, the order of the data is important in a time series.
  • In time series forecasting, you are not concerned with the causal relationship between the response and explanatory variables. The cause behind the changes in the response variable is treated as a black box.

For example, let’s say you want to predict what the value of the stock market index will be next month. You will not look at why the index rises, whether because of an increase in GDP, changes in a particular sector or some other factor. You will only look at the sequence of values for the past months and predict the value for the next month based on that sequence.
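To make the "past values only" idea concrete, here is a minimal sketch in Python. It assumes a lag-1 autoregression, a simple model chosen for illustration rather than the method taught later in the module: each value is predicted from the value just before it, and no explanatory variable such as GDP appears anywhere. The series reuses the values from the first table above; the function names are hypothetical.

```python
# Values from the first table in the text, in time order.
series = [2.4, 3.1, 5.0, 4.5, 7.2, 6.8]


def fit_ar1(values):
    """Least squares fit of values[t] = intercept + slope * values[t - 1]."""
    prev = values[:-1]   # value at time t - 1
    curr = values[1:]    # value at time t
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    slope = sxy / sxx
    intercept = mean_curr - slope * mean_prev
    return intercept, slope


def forecast_next(values):
    """Forecast the next value using only the series' own history."""
    intercept, slope = fit_ar1(values)
    return intercept + slope * values[-1]


next_value = forecast_next(series)
print("forecast for the next time stamp:", round(next_value, 3))
```

The only input is the ordered sequence itself, so changing the order of the values changes the fitted pairs and, in general, the forecast.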
