
Summary

You have come a long way! Let’s look at a summary of everything you have covered in this session.

You learnt about the concept of Machine Translation and how it has evolved over the years.

Traditional NMT model

The sequence-to-sequence (seq2seq) model follows an encoder-decoder architecture made up of two RNNs. Both the encoder and the decoder consist of a series of RNN cells, where each cell’s output is the input to the next cell.
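
To make the idea of chained RNN cells concrete, here is a minimal sketch in plain Python. It assumes a vanilla RNN cell with a tanh activation and random, untrained weights; the names rnn_cell and run_rnn are hypothetical and are not taken from the session’s TensorFlow code.

```python
import math
import random


def rnn_cell(x, h_prev, W_x, W_h, b):
    """One vanilla RNN step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(inputs, hidden_size, seed=0):
    """Unroll the same cell over a sequence, passing the hidden state along."""
    rng = random.Random(seed)
    input_size = len(inputs[0])
    # Random weights for illustration only (assumption: nothing is trained).
    W_x = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
           for _ in range(hidden_size)]
    W_h = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
           for _ in range(hidden_size)]
    b = [0.0] * hidden_size

    h = [0.0] * hidden_size  # initial hidden state
    states = []
    for x in inputs:
        # The state produced at this step is the input state of the next step.
        h = rnn_cell(x, h, W_x, W_h, b)
        states.append(h)
    return states


# Three time steps of 2-dimensional input, hidden size 4.
states = run_rnn([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], hidden_size=4)
```

The last entry of states plays the role of the final cell’s hidden state; its size depends only on hidden_size, not on the length of the input.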

A quick look at the encoder-decoder architecture used for MT:

Here’s the process followed in this architecture:

  • The encoder RNN takes the input sequence and encodes it into a fixed-size context vector by reading the input tokens one at a time.
  • Once the encoder has read all the input tokens, it is passed a special token: <stop>/<end>. This token (which can be named anything, as it’s just an indicator for the model) signals the encoder to stop encoding and pass the last cell’s hidden state to the decoder.
  • The final cell’s hidden state serves as the context vector, which is then fed to the decoder as its input.
  • Along with the context vector, the decoder receives a special token, <start>/<sos> (start of sentence), that signals the model to start decoding.
  • The decoder RNN uses the context vector to generate the output sequence. The first cell of the decoder is initialised with the hidden state that it received from the encoder.
  • The decoder acts as a language model and predicts the next word based on the previous prediction and the hidden state passed from the cell at the previous time step.
  • Once the decoder has produced the translation, it generates a special token, <end>/<eos>, indicating the end of the target sentence.
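
The steps above can be sketched as a toy encoder-decoder in plain Python. This is only an illustration under stated assumptions: the class ToySeq2Seq, its vocabularies, and the greedy (argmax) decoding loop are hypothetical, the weights are random and untrained, and the session’s actual implementation uses TensorFlow instead.

```python
import math
import random

START, END = "<start>", "<end>"


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rnn_step(emb, h, W_x, W_h):
    """Vanilla RNN step: new hidden state from token embedding and old state."""
    return [math.tanh(a + b) for a, b in zip(matvec(W_x, emb), matvec(W_h, h))]


class ToySeq2Seq:
    """Hypothetical, untrained encoder-decoder used only to show the data flow."""

    def __init__(self, src_vocab, tgt_vocab, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.src_index = {tok: i for i, tok in enumerate(src_vocab)}
        self.tgt_vocab = list(tgt_vocab)
        self.tgt_index = {tok: i for i, tok in enumerate(self.tgt_vocab)}
        self.src_emb = make_matrix(len(src_vocab), hidden_size, rng)
        self.tgt_emb = make_matrix(len(self.tgt_vocab), hidden_size, rng)
        self.enc_Wx = make_matrix(hidden_size, hidden_size, rng)
        self.enc_Wh = make_matrix(hidden_size, hidden_size, rng)
        self.dec_Wx = make_matrix(hidden_size, hidden_size, rng)
        self.dec_Wh = make_matrix(hidden_size, hidden_size, rng)
        self.W_out = make_matrix(len(self.tgt_vocab), hidden_size, rng)

    def encode(self, src_tokens):
        # Read the input one token at a time, ending with the <end> marker.
        h = [0.0] * self.hidden_size
        for tok in list(src_tokens) + [END]:
            h = rnn_step(self.src_emb[self.src_index[tok]], h,
                         self.enc_Wx, self.enc_Wh)
        return h  # final hidden state = fixed-size context vector

    def decode(self, context, max_len=10):
        h = context   # first decoder cell starts from the encoder state
        prev = START  # <start> tells the decoder to begin
        output = []
        for _ in range(max_len):
            h = rnn_step(self.tgt_emb[self.tgt_index[prev]], h,
                         self.dec_Wx, self.dec_Wh)
            scores = matvec(self.W_out, h)
            # Greedy choice: feed the best-scoring token back in as next input.
            prev = self.tgt_vocab[max(range(len(scores)),
                                      key=scores.__getitem__)]
            if prev == END:  # <end> marks the end of the target sentence
                break
            output.append(prev)
        return output


src_vocab = ["i", "like", "tea", END]
tgt_vocab = [START, END, "ich", "mag", "tee"]
model = ToySeq2Seq(src_vocab, tgt_vocab)
context = model.encode(["i", "like", "tea"])
translation = model.decode(context)
```

Because the weights are random, translation is not a meaningful sentence; the point is the flow of the hidden state from encoder to decoder and the role of the special tokens. The max_len cap is an added safeguard so that decoding stops even if <end> is never predicted.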

By the end of this session, you also understood how to translate the theory behind the NMT model into an implementation using TensorFlow.

You can download the lecture notes for this module from the link below:
