In a previous segment, we discussed sequences briefly. Let's now take a look at how various types of sequences are fed to RNNs.
Now that you understand how an RNN consumes one sequence, let's see how to train a batch of such sequences.
You learnt how to feed data to an RNN: in the case of an RNN, each data point is a sequence.
The individual sequences are assumed to be independent and identically distributed (i.i.d.), though the entities within a sequence may depend on each other.
As the network ingests each new element of the sequence, it updates its current state (i.e. it updates its activations after consuming each element). After the sequence is finished (say, after T time steps), the output of the last layer of the network, a^L_T (the activation of the last layer L at the final time step T), captures the representation of the entire sequence. You can now do whatever you want with this output – use it to classify the sentence as correct/incorrect, feed it to a regression output, etc. This is exactly analogous to the way CNNs are used to learn a representation of images, which can then be used for a variety of tasks.
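The step-by-step state update described above can be sketched as a simple recurrence. All dimensions, weights, and the tanh non-linearity below are illustrative assumptions, not something specified in this segment:

```python
import numpy as np

# A minimal sketch of an RNN consuming one sequence, element by element.
# All sizes and weights here are made-up assumptions for illustration.
rng = np.random.default_rng(0)

T, input_dim, hidden_dim = 5, 3, 4            # sequence length, feature size, state size
Wx = rng.normal(size=(hidden_dim, input_dim)) * 0.1   # input-to-state weights
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # state-to-state weights
b = np.zeros(hidden_dim)

x = rng.normal(size=(T, input_dim))           # one data point = one sequence of T elements
h = np.zeros(hidden_dim)                      # initial state

for t in range(T):
    # The state is updated after consuming each element of the sequence.
    h = np.tanh(Wx @ x[t] + Wh @ h + b)

# After T steps, h is a fixed-size representation of the whole sequence;
# a classifier or regression head can now consume it.
print(h.shape)
```

The key point the sketch illustrates is that the same weights are reused at every time step, and only the final state is needed to represent the whole sequence.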
You also saw how the data can be fed in batches, just like in any normal neural net – a batch here comprises multiple data points (sequences).
In the next section, you'll look at the architecture of an RNN.