In this session, you learnt the basic architecture of RNNs, some variants of the architecture applied to different types of tasks, and how the information flows between various layers during feedforward and backpropagation.
RNNs are designed to work on sequences, which appear in many domains such as time series data, natural language processing, computer vision, and music and audio. You learnt that the order of the elements in a sequence is very important, and that a standard feedforward neural network is not enough to capture the relationships between entities in a sequence.
You studied the architecture of an RNN and its feedforward equations. There are two types of weight matrices – the feedforward weights which propagate the information through the depth of the network, and the recurrent weights which propagate information through time. The basic feedforward equation is:
$$a_t^l = f\left(W_F^l\, a_t^{l-1} + W_R^l\, a_{t-1}^l + b^l\right)$$
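To make the feedforward equation concrete, here is a minimal NumPy sketch of a single recurrent layer unrolled over a short sequence. The layer sizes, random weights, and tanh activation are illustrative assumptions, not part of the original material:

```python
import numpy as np

# Hypothetical sizes for illustration: input dim 4, hidden dim 3
rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3

W_F = rng.standard_normal((n_hidden, n_in))      # feedforward weights (through depth)
W_R = rng.standard_normal((n_hidden, n_hidden))  # recurrent weights (through time)
b = np.zeros(n_hidden)

def rnn_step(x_t, a_prev):
    """One timestep: a_t = f(W_F x_t + W_R a_{t-1} + b), with f = tanh."""
    return np.tanh(W_F @ x_t + W_R @ a_prev + b)

# Unroll the same cell over a sequence of 5 timesteps
a = np.zeros(n_hidden)  # initial hidden state
for x_t in rng.standard_normal((5, n_in)):
    a = rnn_step(x_t, a)

print(a.shape)  # (3,)
```

Note that the same weight matrices are reused at every timestep; only the hidden state changes as the sequence is consumed.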
We also discussed the types of RNNs and the tasks to which they apply:
- Many-to-one architecture
- Many-to-many architecture
  - Standard many-to-many architecture
  - Encoder-Decoder architecture
- One-to-many architecture
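As an illustration of one of these variants, the many-to-one case can be sketched by reading out only the final hidden state, e.g. for sequence classification. All names, sizes, and weights below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_classes = 4, 3, 2

W_F = rng.standard_normal((n_hidden, n_in))      # feedforward weights
W_R = rng.standard_normal((n_hidden, n_hidden))  # recurrent weights
W_out = rng.standard_normal((n_classes, n_hidden))

def many_to_one(seq):
    """Consume the whole sequence, then emit one output from the last state."""
    a = np.zeros(n_hidden)
    for x_t in seq:
        a = np.tanh(W_F @ x_t + W_R @ a)
    return W_out @ a  # logits produced from the final hidden state only

seq = rng.standard_normal((6, n_in))  # a sequence of 6 timesteps
logits = many_to_one(seq)
print(logits.shape)  # (2,)
```

A one-to-many architecture would invert this pattern: a single input initialises the hidden state, and the cell is then unrolled to emit an output at every timestep.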
Finally, you briefly studied backpropagation through time (BPTT) and how it leads to the problem of vanishing and exploding gradients in RNNs.
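The vanishing and exploding gradient problem can be seen directly in a toy calculation: backpropagating through T timesteps multiplies the gradient by the recurrent Jacobian T times, so its norm shrinks or blows up geometrically. The scaled identity matrix below is an illustrative assumption that makes the effect easy to see:

```python
import numpy as np

def gradient_norm_after(T, scale):
    """Norm of a gradient pushed back through T timesteps of a linear cell.

    With recurrent matrix W_R = scale * I, each backward step multiplies the
    gradient by W_R^T, so the norm scales like scale**T.
    """
    W_R = scale * np.eye(3)  # toy recurrent matrix, spectral radius = scale
    g = np.ones(3)
    for _ in range(T):
        g = W_R.T @ g        # one step of backpropagation through time
    return np.linalg.norm(g)

print(gradient_norm_after(50, 0.9))  # shrinks toward zero (vanishing)
print(gradient_norm_after(50, 1.1))  # grows without bound (exploding)
```

Real RNNs have nonlinear activations and dense recurrent matrices, but the same geometric behaviour, governed by the largest singular value of the recurrent Jacobian, is what motivates gated architectures such as LSTMs and GRUs.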
This brings us to the end of the first session. In the next section, you’ll attempt the graded questions.