Solving Pong with Reinforcement Learning (Policy Gradients)

Install

python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

python train.py --epochs <NUM_EPOCHS>

After training for 1,000 epochs, the model can outplay the opponent.
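
For readers new to policy gradients, here is a minimal sketch of the reward processing step that REINFORCE-style Pong agents commonly use. It is not taken from train.py, which may do this differently. The function names discounted_returns and normalize, the default gamma of 0.99, and the choice to reset the running sum whenever a point is scored are all assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return for every step of one episode.

    Assumes Pong-style rewards: 0 on most steps, +1 or -1 when a point
    is scored. The running sum is reset at each nonzero reward so that
    credit does not leak from one point into the previous one.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the rewards that follow it.
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # point boundary (assumption specific to Pong)
        running = running * gamma + rewards[t]
        returns[t] = running
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit variance to reduce gradient variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / (std + eps) for v in values]


if __name__ == "__main__":
    # Hypothetical episode fragment: the agent wins a point, then loses one.
    episode_rewards = [0, 1, 0, -1]
    print(normalize(discounted_returns(episode_rewards, gamma=0.5)))
```

In a policy gradient update, each action's negative log-probability would typically be weighted by the normalized return for its step, so actions that preceded a won point become more likely and actions that preceded a lost point become less likely.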

python run.py

To use the pretrained weights instead of training, rename model_1000.pt to model.pt before running python run.py.