Recurrent Neural Networks

Recurrent Neural Network Basics

  1. Why we need RNNs (a minimal recurrence sketch follows this list):
    • Variable-length sequences
    • Long-term dependencies
    • Stateful representation
    • Memory
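
A minimal sketch of the core recurrence in plain NumPy (names and sizes are illustrative): the same weights are reused at every step, so the network handles variable-length input, and the hidden state carries information, i.e. memory, forward across steps.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a variable-length sequence.

    x_seq: a list of input vectors, one per time step.
    The hidden state h is the stateful summary carried between steps.
    """
    h = np.zeros(W_hh.shape[0])            # initial state
    states = []
    for x_t in x_seq:                      # works for any sequence length
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)   # same weights reused at every step
        states.append(h)
    return h, states                       # final state + per-step states

# Illustrative usage with random data
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((16, 8))        # input-to-hidden weights
W_hh = rng.standard_normal((16, 16))       # hidden-to-hidden weights
b_h = np.zeros(16)
x_seq = [rng.standard_normal(8) for _ in range(5)]   # a 5-step sequence
h_final, _ = rnn_forward(x_seq, W_xh, W_hh, b_h)
```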

Types of RNNs:

One output at each input step (sequence to sequence of the same length):

Single fixed-length vector to a series of outputs (vector to sequence):

Bi-Directional RNNs: These capture information from left to right and from right to left, using two RNNs, one for each direction. For example, in speech recognition, to recognize a phoneme (a distinct unit of sound) at input step "i" we need information from steps "i-1" and "i+1", i.e. both past and future context.
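
A minimal sketch using PyTorch's built-in RNN with `bidirectional=True` (feature and layer sizes are illustrative): the output at each step concatenates the forward and backward hidden states, so step "i" sees both past and future context.

```python
import torch
import torch.nn as nn

# Bidirectional RNN: one RNN reads left-to-right, the other right-to-left.
birnn = nn.RNN(input_size=40, hidden_size=128, bidirectional=True)

frames = torch.randn(100, 1, 40)    # (seq_len, batch, features), e.g. 100 audio frames
outputs, h_n = birnn(frames)

# Each step's output concatenates both directions: shape (100, 1, 2 * 128).
print(outputs.shape)
# Per-step phoneme scores could then be predicted from this 256-dim vector.
```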

Vector2Sequence Architecture (for Image Captioning): A single fixed-length vector producing a series of outputs can be used for image captioning. From an image --> use a CNN or MLP to extract features --> obtain a feature vector --> use an RNN to generate the caption word by word.

word_next = Captioning(word_current, image_feature_vector)
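
A minimal decoder sketch in PyTorch, assuming the image feature vector is used to initialize a GRU's hidden state (one common choice; class name, sizes, and the start token are illustrative):

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    """Generate a caption word by word from a fixed-length image feature vector."""
    def __init__(self, vocab_size=10000, embed_dim=256, feat_dim=2048, hidden_dim=512):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hidden_dim)    # image feature -> initial state
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRUCell(embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_feature, start_token=0, max_len=20):
        h = torch.tanh(self.init_h(image_feature))       # condition the state on the image
        word = torch.tensor([start_token])
        caption = []
        for _ in range(max_len):
            h = self.rnn(self.embed(word), h)            # next word depends on current word + state
            word = self.out(h).argmax(dim=-1)            # greedy choice of the next word
            caption.append(word.item())
        return caption

decoder = CaptionDecoder()
caption_ids = decoder(torch.randn(1, 2048))              # e.g. a CNN feature vector
```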

Seq2Seq Architecture (for Neural Machine Translation): We need to map a sequence to another sequence of a different length. The Encoder-Decoder / Seq2Seq model has 2 RNNs. The encoder processes the input sequence one word at a time and does not emit an output at each step; instead it accumulates task-relevant information from the sequence in its internal state. The final hidden state of the encoder is a task-relevant summary of the input sequence, called the context or thought vector.

The context acts as the only input to the decoder: the initial state of the decoder can be a function of the context, or the context can be connected to all the hidden states of the decoder. The hyperparameters of the encoder and decoder can differ.
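
A minimal encoder-decoder sketch in PyTorch, assuming the context vector is used as the decoder's initial hidden state (vocabulary and layer sizes are illustrative, not a reference implementation):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """The encoder reads the source sentence; its final state is the context vector
    that initializes the decoder, which then emits the target sentence."""
    def __init__(self, src_vocab=8000, tgt_vocab=8000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, embed_dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim)      # per-step encoder outputs are unused
        self.decoder = nn.GRU(embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode: keep only the final hidden state (the context / thought vector).
        _, context = self.encoder(self.src_embed(src_ids))
        # Decode: the context becomes the decoder's initial state.
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), context)
        return self.out(dec_out)                          # per-step target-word logits

model = Seq2Seq()
src = torch.randint(0, 8000, (12, 1))   # source sentence, 12 tokens, batch of 1
tgt = torch.randint(0, 8000, (9, 1))    # target sentence (fed for teacher forcing)
logits = model(src, tgt)                # shape (9, 1, tgt_vocab)
```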

RNN Unrolled Version: When unrolled, the network has one copy of its weights per time step, so the unrolled depth equals the number of time steps. By adding more hidden layers we can stack RNNs to get deep RNNs for an input sequence. The hidden-to-hidden connection applies a weight matrix followed by a non-linear transformation; additional MLP (or CNN) layers can be stacked between recurrent layers to learn higher-level representations. Deeper RNNs take more time to train.
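
A short sketch of stacking recurrent layers in PyTorch via `num_layers` (values are illustrative); each added layer deepens the network at every time step.

```python
import torch
import torch.nn as nn

# A 3-layer (stacked) LSTM: layer k's hidden states feed layer k+1 at every time step.
deep_rnn = nn.LSTM(input_size=64, hidden_size=128, num_layers=3)

x = torch.randn(50, 8, 64)              # (seq_len=50, batch=8, features=64)
outputs, (h_n, c_n) = deep_rnn(x)

print(outputs.shape)    # (50, 8, 128): top-layer states for every time step
print(h_n.shape)        # (3, 8, 128): final state of each of the 3 stacked layers
```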

Variants of RNNs: Variants differ in the type of "CELL" they use. Each "CELL" has its own gating mechanism, i.e. how it controls the flow of information from the input to the current state, from the previous step to the current state, and from the current state to the output.
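
As a concrete illustration of gating, here is a minimal GRU-style cell in plain NumPy (following the standard GRU formulation; parameter names and sizes are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: gates decide how much old state to keep and how much new input to admit."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev)               # update gate: old state vs. new candidate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)               # reset gate: how much history enters the candidate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate state from gated history + input
    return (1 - z) * h_prev + z * h_tilde             # gated blend of old and new information

# Illustrative usage with random parameters
rng = np.random.default_rng(0)
dim_x, dim_h = 8, 16
Wz, Wr, Wh = (rng.standard_normal((dim_h, dim_x)) for _ in range(3))
Uz, Ur, Uh = (rng.standard_normal((dim_h, dim_h)) for _ in range(3))
h_new = gru_cell(rng.standard_normal(dim_x), np.zeros(dim_h), Wz, Uz, Wr, Ur, Wh, Uh)
```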

RNN rolled Version

LSTM and GRU Cell
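
A short sketch of the corresponding cells as exposed by PyTorch (sizes are illustrative): the LSTM cell carries a separate cell state c alongside the hidden state h, while the GRU uses a single gated hidden state.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 32)                      # batch of 4 inputs, 32 features each

# LSTM cell: gated updates to both a hidden state h and a cell state c.
lstm_cell = nn.LSTMCell(input_size=32, hidden_size=64)
h = torch.zeros(4, 64)
c = torch.zeros(4, 64)
h, c = lstm_cell(x, (h, c))

# GRU cell: a simpler gating scheme with a single hidden state.
gru_cell = nn.GRUCell(input_size=32, hidden_size=64)
h2 = gru_cell(x, torch.zeros(4, 64))
```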

RNN Blogs

  1. The Unreasonable Effectiveness of Recurrent Neural Networks - Karpathy

  2. Getting Started with RNNs: tutorial from WildML

  3. Machine Translation

  4. Unfolding RNNs : 1

  5. Unfolding RNNs : 2
