A 2D top-down racing game where a car learns to drive around an oval track using Reinforcement Learning (PPO via StableBaselines3). The Godot engine simulates physics; Python handles training via TCP using the Godot RL Agents bridge.
```
Player ──── (manual play)
   │
Godot 4 Game ◄────TCP────► Python PPO Training
   │                            │
8 Raycasts                 StableBaselines3
Physics sim                checkpoints/models/
```
```bash
cd Self_Driver
git submodule add https://github.com/edbeeching/godot_rl_agents.git addons/godot_rl_agents
cd training
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

**Terminal 1 – Start Godot:**
```bash
# Open project.godot in Godot 4 and press Play (F5)
# The game window will appear and wait for a TCP connection
```

**Terminal 2 – Start Training:**
```bash
cd training
source venv/bin/activate
python train.py
```

Training runs for 500,000 steps (~30–60 min depending on hardware).
Watch live metrics: `tensorboard --logdir ./tb_logs`
```bash
python train.py --mode run --model models/self_driving_ppo_final.zip
```

**Observation space:**

| Index | Description |
|---|---|
| 0–7 | Normalized raycast distances (0 = wall, 1 = clear air) |
| 8 | Normalized speed (0–1) |
| 9 | Heading angle delta to next checkpoint (−1 to +1) |
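The table above can be assembled into a flat 10-dimensional vector. A minimal sketch of that packing, in Python for illustration — the function name and the max-range constants here are assumptions, not values taken from the project:

```python
import math

RAY_MAX_DIST = 200.0   # assumed raycast length (pixels); illustrative only
MAX_SPEED = 400.0      # assumed top speed (pixels/s); illustrative only

def build_observation(ray_hits, speed, heading_delta_rad):
    """Pack the 10-dim observation: 8 rays, speed, heading delta."""
    # Indices 0-7: raycast distances scaled to [0, 1] (0 = touching a wall, 1 = clear)
    rays = [min(d, RAY_MAX_DIST) / RAY_MAX_DIST for d in ray_hits]
    # Index 8: speed scaled to [0, 1]
    speed_n = min(speed, MAX_SPEED) / MAX_SPEED
    # Index 9: signed angle to the next checkpoint, scaled from [-pi, pi] to [-1, 1]
    heading_n = heading_delta_rad / math.pi
    return rays + [speed_n, heading_n]
```

For example, `build_observation([200.0] * 8, 200.0, math.pi / 2)` yields eight clear-air rays (all `1.0`), half speed (`0.5`), and a heading delta of `0.5`.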
**Action space:**

| Index | Range | Description |
|---|---|---|
| 0 | −1 to +1 | Steering (left/right) |
| 1 | 0 to +1 | Throttle (gas) |
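Since the policy's raw outputs are unbounded, each channel has to be clamped into its valid range before being applied to the car. A sketch of that clamping (illustrative, not the project's actual code):

```python
def clip_action(raw):
    """Clamp a raw policy output [steer, throttle] into the valid ranges."""
    steer = max(-1.0, min(1.0, raw[0]))    # index 0: steering in [-1, +1]
    throttle = max(0.0, min(1.0, raw[1]))  # index 1: throttle in [0, +1]
    return [steer, throttle]
```

So an out-of-range output such as `[2.0, -0.5]` becomes `[1.0, 0.0]`: full right steering, no gas.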
**Rewards:**

| Event | Reward |
|---|---|
| Per step (speed bonus) | +0.1 × (speed / max_speed) |
| Checkpoint passed | +5.0 |
| Wall crash | −10.0 |
| Episode timeout (30 s) | Episode ends |
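The per-step reward combines these terms. A minimal sketch, assuming a hypothetical `MAX_SPEED` constant (the real shaping lives in the Godot scripts):

```python
MAX_SPEED = 400.0  # assumed top speed; purely illustrative

def step_reward(speed, passed_checkpoint, crashed):
    """Combine the shaping terms from the reward table above."""
    r = 0.1 * (speed / MAX_SPEED)  # small per-step bonus for driving fast
    if passed_checkpoint:
        r += 5.0                   # big bonus for progress around the track
    if crashed:
        r -= 10.0                  # heavy penalty; the episode also ends
    return r
```

Note that the speed bonus (at most +0.1 per step) is dwarfed by checkpoints and crashes, which is what pushes the agent toward lap progress rather than mere motion.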
See `training/train.py` for all PPO hyperparameters. Key values:
| Param | Value |
|---|---|
| Algorithm | PPO |
| Steps per update | 2,048 |
| Batch size | 64 |
| Learning rate | 3e-4 |
| Discount (γ) | 0.99 |
| Network | MLP [256, 256] actor + critic |
| Total timesteps | 500,000 |
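These values map directly onto the StableBaselines3 `PPO` constructor. A hedged sketch of that mapping (the env setup is elided, and the `net_arch` spelling follows recent SB3 versions — check `training/train.py` for the exact form used here):

```python
# Keyword arguments mirroring the table above, as they would be passed to
# stable_baselines3.PPO("MlpPolicy", env, **ppo_kwargs).
ppo_kwargs = dict(
    n_steps=2048,          # steps collected per policy update
    batch_size=64,         # minibatch size for each gradient step
    learning_rate=3e-4,
    gamma=0.99,            # discount factor
    policy_kwargs=dict(net_arch=dict(pi=[256, 256], vf=[256, 256])),
)
TOTAL_TIMESTEPS = 500_000  # passed to model.learn(total_timesteps=...)
```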
Once trained, you can load the model directly into Godot without Python:

- Export the SB3 model to ONNX (see the `godot_rl_agents` docs)
- Place the `.onnx` file in `models/`
- Set `control_mode = 1` on the `Sync` node in `Main.tscn`
- Point the `Sync` node to your `.onnx` file
- Press Play: the car drives itself with no Python!
- Add tracks: Duplicate `Track.tscn`, change wall shapes and checkpoint positions
- Add opponents: Instance multiple `Car.tscn` nodes with different spawn points
- Curriculum learning: Start with wider tracks, gradually narrow them
- Add braking: Extend the action space to `[steering, throttle, brake]`
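The curriculum-learning idea can be expressed as a simple schedule. A sketch, assuming a hypothetical track-width parameter that the training script could push to Godot between episodes (no such hook exists in this repo out of the box):

```python
def track_width_for(step, start=300.0, end=120.0, anneal_steps=400_000):
    """Linearly narrow a hypothetical track width (pixels) as training progresses."""
    frac = min(step, anneal_steps) / anneal_steps
    return start + (end - start) * frac
```

Early in training the car gets a forgiving 300 px corridor; by step 400,000 the width has narrowed to 120 px and stays there.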