Install required packages with:
pip install gymnasium matplotlib opencv-python pygame stable_baselines3
To set up the maze environment, run:
python setup.py install
- One agent tries to reach the fixed end of the maze (green), while the other tries to reach the randomly spawning object (red).
- Each agent gets its own env, which goes through a series of mazes that are identical across the two agents.
- The maze is reset when either (i) the episode reaches the maximum length of 500 timesteps or (ii) the agent reaches its goal.
- log_agent#.csv: a separate log file is recorded for each agent. Each entry holds the episode number, the environment number, and the number of timesteps the agent took to reach the goal state since the last maze reset.
- performance_agent#.csv: a separate performance file is recorded for each agent. Each entry holds the episode number and the following metrics for that episode: cumulative reward, pi loss, value loss, and entropy loss.
- Run train_agent#.py to train the corresponding agent.
- Models are saved into saved_models/agent#/{timestep}.pt
- Log data is saved into log_agent#.csv
- Performance data is saved into performance_agent#.csv
- Run plot_agent#.py to visualize the mean reward and loss graphs.
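
To inspect a performance file outside of plot_agent#.py, a minimal sketch along the following lines may help. It is not the repository's plotting code. It assumes the CSV has a header row, and the helper names `load_column` and `moving_average` are hypothetical, introduced here only for illustration.

```python
import csv
from pathlib import Path


def load_column(csv_path, column):
    """Read one numeric column from a CSV file that has a header row.

    Assumption: the file has a header row; `column` must match the
    header name actually used in the file.
    """
    with Path(csv_path).open(newline="") as handle:
        return [float(row[column]) for row in csv.DictReader(handle)]


def moving_average(values, window):
    """Return the mean of each run of `window` consecutive values.

    The result is empty when there are fewer than `window` values.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

A call such as `moving_average(load_column("performance_agent1.csv", "cumulative_reward"), window=10)` would then give a smoothed reward curve that can be passed to matplotlib. Both the file name and the column name in that call are assumptions, so check them against the header of the real file first.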