Policy Gradient and Q-Learning are two techniques in Reinforcement Learning (a branch of Machine Learning). Q-Learning is the more popular of the two, thanks to its victories over human experts in video games and the game of Go. However, previous successful applications of Reinforcement Learning to the financial portfolio management problem have mostly been limited to Policy Gradient variants, because the problem has a continuous action space. In this project, we employ a simple discretization scheme to fit the problem into Q-Learning, which requires a discrete action space. The validity of this approach will then be assessed by comparing its performance against other, more established methods.
- use gym
- use backtrader
- use btgym
- code from scratch
- ...
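
The discretization idea from the opening paragraph could be sketched as follows. This is only an illustration under stated assumptions, not the project's actual scheme: it assumes long-only portfolios whose weights are multiples of `1/steps` and sum to 1, and it pairs the resulting action set with a plain tabular Q-Learning update. The names `discretize_weights`, `make_q_table`, and `q_update`, and the string state labels, are hypothetical.

```python
from collections import defaultdict
from itertools import product


def discretize_weights(n_assets, steps):
    """Enumerate weight vectors whose entries are multiples of 1/steps and sum to 1.

    Assumption: long-only, fully invested portfolios (no shorting, no leverage).
    """
    actions = []
    for units in product(range(steps + 1), repeat=n_assets):
        if sum(units) == steps:
            actions.append(tuple(u / steps for u in units))
    return actions


def make_q_table(n_actions):
    """Q-table mapping a hashable state to one value per discrete action."""
    return defaultdict(lambda: [0.0] * n_actions)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-Learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# 3 assets at 25% granularity give 15 discrete actions.
actions = discretize_weights(n_assets=3, steps=4)
q = make_q_table(len(actions))

# Hypothetical transition: in state "s0", pick action 2, receive reward 1.0, land in "s1".
q_update(q, "s0", 2, reward=1.0, next_state="s1")
```

The number of actions grows combinatorially with the number of assets and the granularity, so a finer grid or a larger portfolio would likely need a function approximator in place of the table.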