i have som question

in run_scripts/train_baseline.py
Hi, origin paper use A3C to train the agent, but I found in the above file that each agent will be assigned a PPO policy network, so which network will be trained？A3C or PPO?i first time use rllib and ray,i didn’t understand why To set up like this