Project work for the course Autonomous and Adaptive Systems.
It's a semi-decent implementation of the PPO algorithm, tested on various ProcGen environments. A single combined model serves as both actor and critic.
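
For readers unfamiliar with PPO, here is a minimal sketch of the clipped loss that a combined actor-critic model is typically trained on. It is not taken from this repository: the function name `ppo_loss`, its argument names, and the coefficient defaults (`clip_eps=0.2`, `vf_coef=0.5`, `ent_coef=0.01`) are assumptions chosen for illustration, and NumPy is used only to keep the example free of any deep learning framework.

```python
import numpy as np


def ppo_loss(new_logp, old_logp, advantages, values, returns, entropy,
             clip_eps=0.2, vf_coef=0.5, ent_coef=0.01):
    """Hypothetical PPO loss for a model that outputs both policy and value.

    All inputs are 1-D arrays over a batch of transitions. The coefficient
    defaults are common choices, not values taken from this project.
    """
    # Probability ratio between the updated policy and the one that
    # collected the data.
    ratio = np.exp(new_logp - old_logp)

    # Clipped surrogate objective: take the more pessimistic of the two terms.
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    policy_loss = -np.mean(np.minimum(unclipped, clipped))

    # Critic term: mean squared error between predicted values and returns.
    value_loss = np.mean((returns - values) ** 2)

    # Entropy bonus is subtracted so that minimizing the loss encourages
    # exploration.
    return policy_loss + vf_coef * value_loss - ent_coef * np.mean(entropy)
```

Because actor and critic share one model, the three terms are summed into a single scalar so that one backward pass can update the shared parameters. The weights on the value and entropy terms are tuning knobs.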
To list all the available options, run:
python main.py --help