Now that you have learnt the various error measures, in the next video, you will learn how cross-validation is performed in time series forecasting.
Cross-validation in time series is not done in the same way as for the classical machine learning algorithms, where the data is typically shuffled and split at random. This caveat stems from the fact that order matters in time series: a model should be trained only on past values and tested on later ones. So while building a forecast model, the test dataset always lies to the right of the train dataset on the time axis. Let’s understand the two types of validation you learnt just now.
- One-Step Validation
- The test point is always just one step ahead of the training set. Suppose that, out of 15 data points, you decide to keep the first 10 as ‘train’ and the next 5 as ‘test’. The data points in the test set are taken one by one, starting from the left, since you need all the previous values to predict the future values. So first, you take the 11th point as the test point. Once this value has been forecast, you move on to the 12th point, which becomes your new test point, and so on until you have forecast all 5 points. This idea is represented in the image below. Here, the blue squares represent the train data, the yellow squares represent the test data, and the purple squares are later values that can be forecast only after the earlier test points have been forecast. Note that the image uses 7 test data points rather than 5, hence it takes 7 iterations (or forecasts) to fully predict the test set.
- Multi-Step Validation
- This is the same as one-step validation, with one difference: the test point is not the one immediately to the right of the last training data point. Instead, you skip a few of these points and forecast further into the future. This can be seen in the image below.
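
To make the two schemes concrete, here is a minimal Python sketch that only generates the train/test indices for the 15-point example; it does not fit any model. The function name `walk_forward_splits` and its `horizon` parameter are hypothetical names chosen for this illustration. The sketch also assumes that each forecasted point joins the training data before the next step (an expanding window), which is one reasonable reading of "you need all the previous values". Indices are zero-based, so index 10 is the 11th point.

```python
def walk_forward_splits(n_points, n_train, horizon=1):
    """Yield (train_indices, test_index) pairs, moving left to right in time.

    horizon=1 -> one-step validation: the test point is immediately
                 after the last training point.
    horizon>1 -> multi-step validation: horizon - 1 points after the
                 last training point are skipped.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")

    train_end = n_train  # exclusive end of the training window
    while train_end + horizon - 1 < n_points:
        test_index = train_end + horizon - 1
        yield list(range(train_end)), test_index
        train_end += 1  # assumption: the window expands by one point per step


# 15 points, first 10 used as the initial training set
one_step = list(walk_forward_splits(15, 10, horizon=1))
multi_step = list(walk_forward_splits(15, 10, horizon=3))

print([test for _, test in one_step])    # [10, 11, 12, 13, 14]
print([test for _, test in multi_step])  # [12, 13, 14]
```

With `horizon=1`, the 5 test points are forecast in 5 iterations, one per point. With `horizon=3`, two points are skipped after the training data each time, so only 3 forecasts fit inside the 15 points. In both cases, every training index is smaller than the test index, so the model never sees the future.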