Deep Reinforcement Learning
DRL university course lecture notes & exercises
Chapter
Sections recap
Hello world
Basic terminology and definitions (based on Spinning Up in Deep RL by OpenAI)
RL Basics
MDPs, Policy/Value Iteration, MC, SARSA & Q-Learning
DQN & its derivatives
Deep Q-Network (DQN), Double DQN, Dueling DQN
Policy Gradients
REINFORCE, REINFORCE with Baseline, Actor-Critic methods
Imitation Learning
Apprenticeship, supervised and forward learning; DAgger, DAgger with coaching
Multi-Armed Bandit
Bandit algorithm, gradient-based algorithm, contextual bandits, Thompson sampling
RL use-case: AlphaGo
Monte Carlo Tree Search, AlphaGo, AlphaZero
Meta and Transfer Learning
Concepts in meta-learning and transfer learning in the context of RL
Large action spaces
A review of papers on handling large action spaces
Advanced model learning & exploration
Learning in latent space, next-state prediction, exploration schemes
Exercise
Description
ex1
Q-Learning and Deep Q-Learning (DQN) implementations from scratch
ex2
REINFORCE (with and without baseline) and Monte Carlo Actor-Critic implementations from scratch