Once we have implementations of the Layer class (#17), the Optimizer class, and the DataSet class, we can go about creating RNN flavors. Three models should be implemented:
- Vanilla RNN
- LSTM
- GRU
Each of these will require implementing its forward-prop values and the corresponding derivatives.
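As a concrete starting point, here is a minimal sketch of the vanilla RNN forward step (NumPy-based; the name `vanilla_rnn_step` and its signature are illustrative, not part of the Layer API from #17):

```python
import numpy as np

def vanilla_rnn_step(x_t, h_prev, W_xh, W_hh, b):
    """One forward step: h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b).

    The LSTM and GRU steps have the same overall shape but add gates,
    each with its own weight matrix and bias.
    """
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)
```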
Certain details to consider:
- RNNs have a stack of weight matrices and biases (not just one per Layer), so the Layer class needs to be general enough to handle this.
- Gradient computation needs to be handled via two methods (see the sketch after this list):
  - RTRL (real-time recurrent learning)
  - BPTT (backpropagation through time)
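To make the BPTT requirement concrete, here is a minimal sketch for the vanilla RNN case; it unrolls the cached hidden states and accumulates the weight gradients backward through time. Function names and the gradient-passing convention are assumptions, not a fixed design:

```python
import numpy as np

def rnn_forward(xs, h0, W_xh, W_hh, b):
    """Run the unrolled forward pass, caching every hidden state for BPTT."""
    hs = [h0]
    for x in xs:
        hs.append(np.tanh(W_xh @ x + W_hh @ hs[-1] + b))
    return hs

def rnn_bptt(xs, hs, dhs, W_xh, W_hh):
    """BPTT for the vanilla RNN above.

    dhs[t] is dL/dh_{t+1}, i.e. the loss gradient arriving at the
    hidden state produced at step t (e.g. from an output layer).
    """
    dW_xh = np.zeros_like(W_xh)
    dW_hh = np.zeros_like(W_hh)
    db = np.zeros(W_hh.shape[0])
    dh_next = np.zeros(W_hh.shape[0])     # gradient flowing back from the future
    for t in reversed(range(len(xs))):
        dh = dhs[t] + dh_next             # total gradient at h_{t+1}
        da = dh * (1.0 - hs[t + 1] ** 2)  # back through tanh
        dW_xh += np.outer(da, xs[t])
        dW_hh += np.outer(da, hs[t])
        db += da
        dh_next = W_hh.T @ da             # send gradient on to h_t
    return dW_xh, dW_hh, db
```

RTRL, by contrast, carries sensitivity matrices forward during the forward pass, so no hidden states need caching, at a much higher per-step compute cost.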
To enable both of these learning methods, we should consider implementing a RecurrentLayer that inherits from Layer; a sketch follows.
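A rough sketch of what that subclass could look like (the Layer interface here is a guess, since #17 defines the real one; `rnn_forward` is the helper from the BPTT sketch above):

```python
import numpy as np
from layers import Layer  # hypothetical import; the real class comes from #17

class RecurrentLayer(Layer):
    """A Layer that owns a stack of weight matrices and biases, not just one pair."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        rng = np.random.default_rng()
        scale = 1.0 / np.sqrt(hidden_size)
        self.W_xh = rng.normal(0.0, scale, (hidden_size, input_size))
        self.W_hh = rng.normal(0.0, scale, (hidden_size, hidden_size))
        self.b = np.zeros(hidden_size)
        # Hypothetical convention: expose every parameter tensor so the
        # Optimizer can iterate over them uniformly.
        self.params = [self.W_xh, self.W_hh, self.b]

    def forward(self, xs, h0):
        # Cache the hidden states: BPTT needs them, RTRL would not.
        self.hs = rnn_forward(xs, h0, self.W_xh, self.W_hh, self.b)
        return self.hs[-1]
```

An LSTM or GRU variant would extend `params` with the gate matrices, which is why the Layer class should not assume a single weight/bias pair.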