Contains a few implementations of reinforcement learning (RL) algorithms, based on the code by Denny Britz on wildml.com, which is itself based on David Silver's lectures and on Richard Sutton and Andrew Barto's book Reinforcement Learning: An Introduction (2nd Edition).
- Dynamic programming policy iteration
- Dynamic programming policy evaluation
- Dynamic programming value iteration
- Gambler's Problem
- Monte Carlo Prediction
- Monte Carlo Control with epsilon-greedy policies
- Off-Policy Monte Carlo Control with weighted importance sampling
- SARSA (on-policy TD learning)
- Q-learning (off-policy TD learning)
- Q-learning with value function approximation
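
To illustrate how the on-policy and off-policy TD methods listed above differ, here is a minimal tabular sketch. It is not taken from this repository: the function names (`epsilon_greedy`, `sarsa_target`, `q_learning_target`, `td_update`) and the layout of `Q` as a dict mapping each state to a list of action values are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = Q[state]
    return max(range(n_actions), key=lambda a: values[a])


def sarsa_target(Q, reward, next_state, next_action, gamma, done):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + (0.0 if done else gamma * Q[next_state][next_action])


def q_learning_target(Q, reward, next_state, gamma, done):
    """Off-policy target: bootstrap from the greedy action in the next state."""
    return reward + (0.0 if done else gamma * max(Q[next_state]))


def td_update(Q, state, action, target, alpha):
    """Move Q[state][action] a step of size alpha toward the target."""
    Q[state][action] += alpha * (target - Q[state][action])


# Hypothetical two-action problem; unseen states start with zero values.
Q = defaultdict(lambda: [0.0, 0.0])
Q[1] = [1.0, 3.0]
rng = random.Random(0)

# Same transition (state 0, action 0, reward 1.0, next state 1), two targets.
on_policy = sarsa_target(Q, 1.0, 1, next_action=0, gamma=0.9, done=False)
off_policy = q_learning_target(Q, 1.0, 1, gamma=0.9, done=False)
td_update(Q, 0, 0, off_policy, alpha=0.5)
```

The only difference between the two methods in this sketch is the target: SARSA uses the value of the next action chosen by the behaviour policy, while Q-learning uses the maximum value over the next state's actions.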