In this session, you’ll use stock market index data to implement the 1D CNN-RNN architecture. Generally speaking, a stock market index is affected by a multitude of factors, including the news headlines that run daily on television and in newspapers. Highly negative news tends to push the market down, and positive news tends to push it up.
In this session, we’ll try to model this relationship between the news and the stock market price of an index. Our assumption in modelling the stock price in this exercise is that news headlines that run on a particular day affect the opening stock price of an index the very next morning.
That being said, the important thing to keep in mind while doing this exercise is to focus on the process of extracting textual features using 1D CNN and then feeding them to an RNN.
CNN with RNN
As you know from the previous segment, you can use a 1D CNN to quickly extract meaningful features from a sequence, which yields a much shorter sequence of feature vectors. You can then feed this sequence to an RNN in the same way as you would feed a sentence. This will become clearer in the upcoming segments.
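To make the "much shorter sequences" point concrete, here is a minimal sketch of the length arithmetic. It assumes an unpadded ('valid') convolution followed by a pooling step; the function name and all sizes are hypothetical and chosen only for illustration, not taken from the session notebook.

```python
def conv1d_output_length(seq_len, kernel_size, stride=1):
    """Number of positions produced by an unpadded 1D window of size
    `kernel_size` that moves `stride` steps at a time."""
    return (seq_len - kernel_size) // stride + 1


# Assumed sizes: a 100-token input, a width-5 convolution, a width-4 pooling.
seq_len = 100
after_conv = conv1d_output_length(seq_len, kernel_size=5)
after_pool = conv1d_output_length(after_conv, kernel_size=4, stride=4)

print(seq_len, after_conv, after_pool)  # 100 96 24
```

Under these assumptions, the RNN would process 24 time steps instead of 100. Most of the shortening comes from the strided pooling step rather than from the convolution itself.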
The idea of using 1D CNNs with RNNs was proposed in the paper titled “Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts”. Feel free to go through the paper before proceeding. The concept of the CNN-RNN model proposed in the paper will be explained in detail in this session.
Now, download the dataset from the link provided below.
The above dataset consists of two CSV files:
- The first file contains the stock market data that belongs to the Dow Jones Industrial Average index.
- The second file contains a series of dates and the corresponding American news headlines from major news sources of the country.
You’ll use the above data to model the effect of the news of a particular day on the opening price of the Dow Jones index on the next day.
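The pairing of one day's headlines with the next day's opening price could be sketched with pandas as shown below. The column names (`Date`, `Open`, `Headline`) and the toy values are assumptions made for illustration; the actual CSV files may use a different layout.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (made-up values, assumed column names).
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-04", "2016-01-05", "2016-01-06"]),
    "Open": [100.0, 101.5, 99.0],
})
news = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-04", "2016-01-04", "2016-01-05"]),
    "Headline": ["headline a", "headline b", "headline c"],
})

# One text per day: join all headlines published on the same date.
daily_news = news.groupby("Date")["Headline"].agg(" ".join).reset_index()

# The target for day t is the opening price on the next available row,
# i.e. the next trading day in the price file.
prices = prices.sort_values("Date").reset_index(drop=True)
prices["NextOpen"] = prices["Open"].shift(-1)

# Keep only the days that have both headlines and a next-day opening price.
dataset = daily_news.merge(prices[["Date", "NextOpen"]], on="Date", how="inner")
dataset = dataset.dropna(subset=["NextOpen"]).reset_index(drop=True)

print(dataset)
```

In this sketch, news published on a non-trading day is dropped by the inner join. Whether to keep such days and map them to the next trading day is a design choice that the session does not specify.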
Before moving further, download the notebook used in the session.
In the following lecture, Gunjan will explain how to work with a supervised learning problem involving text as the independent variable.
The value that the model should predict is the opening price of the stock market. Since the variable is continuous, it’s a regression problem.
Now, let’s see how you can apply the approach explained by Gunjan to this particular problem of using headlines to predict the stock price.
Having looked at the inputs and outputs of our model, let’s try to understand the Feature Extraction & Model building block. This is the architecture of the CNN-RNN model as proposed in the paper.
As you can see in the above architecture, multiple convolutional layers are applied in parallel to the ‘feature representation’ of the text. The feature representation of the text can be a one-hot encoding or a dense vector representation such as word2vec or GloVe. The outputs of the convolutional layers are concatenated, and an RNN layer operates on top of the concatenated output.
As we know, to produce the final prediction in a classification or regression problem, we need a fully-connected layer as the output layer. So, a fully-connected layer sits on top of the RNN. Also, we will not add a ‘softmax’ output, as it is used for classification and not regression. As a matter of fact, the output layer will have no activation at all, so the predicted value is not bounded.
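The data flow described above (parallel convolutions, concatenation, RNN, linear output) can be traced with a framework-free NumPy forward pass. This is only a shape-level sketch with random, untrained weights. The kernel widths, layer sizes, 'same' padding and plain tanh RNN are all assumptions made for illustration; they are not the settings used in the paper or in the session notebook.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1d_same(x, kernel):
    """1D convolution over time with zero 'same' padding, followed by ReLU.
    x: (time, in_dim), kernel: (width, in_dim, filters) -> (time, filters)."""
    width = kernel.shape[0]
    left = (width - 1) // 2
    right = width - 1 - left
    padded = np.pad(x, ((left, right), (0, 0)))
    out = np.stack([
        np.tensordot(padded[t:t + width], kernel, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0])
    ])
    return np.maximum(out, 0.0)


def simple_rnn_last_state(x, w_in, w_rec, bias):
    """Plain tanh RNN over x: (time, features); returns the final hidden state."""
    h = np.zeros(w_rec.shape[0])
    for x_t in x:
        h = np.tanh(x_t @ w_in + h @ w_rec + bias)
    return h


# Hypothetical sizes, chosen only for illustration.
time_steps, embed_dim, filters, hidden = 20, 8, 4, 6
kernel_widths = (3, 4, 5)

# Stand-in for the feature representation of the text (e.g. word embeddings).
x = rng.normal(size=(time_steps, embed_dim))

# Parallel convolution branches, one per kernel width.
branches = [
    conv1d_same(x, rng.normal(scale=0.1, size=(w, embed_dim, filters)))
    for w in kernel_widths
]
features = np.concatenate(branches, axis=-1)  # (time_steps, 3 * filters)

# The RNN runs over the concatenated feature sequence.
h_last = simple_rnn_last_state(
    features,
    rng.normal(scale=0.1, size=(features.shape[1], hidden)),
    rng.normal(scale=0.1, size=(hidden, hidden)),
    np.zeros(hidden),
)

# Fully-connected output with no activation: one unbounded regression value.
w_out, b_out = rng.normal(scale=0.1, size=hidden), 0.0
prediction = h_last @ w_out + b_out

print(features.shape, h_last.shape, float(prediction))
```

Because every branch uses 'same' padding, all branches keep the same number of time steps, which is what allows them to be concatenated along the feature axis before the RNN.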
Additional reading:
Xingyou Wang, Weijie Jiang and Zhiyong Luo. December 11-17, 2016. Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts.
In the next few segments, you will first go through the pre-processing steps for data cleaning and then implement the CNN-RNN model in Python.