Commit c7e3afa

update docs & configs
1 parent e7e080b commit c7e3afa

21 files changed

Lines changed: 439 additions & 6 deletions

README.md

Lines changed: 7 additions & 3 deletions
@@ -1,9 +1,12 @@
1-
# **WorldEngine: Towards the Era of Post-Training for Physical AI**
2-
31
<div align="center">
2+
<img src="docs/imgs/WE_title.png" width="800px">
3+
4+
# Towards the Era of Post-Training for Physical AI
45

56
[![Paper](https://img.shields.io/badge/Paper-Coming_Soon-b31b1b.svg?style=for-the-badge&logo=arxiv)](https://github.com/OpenDriveLab/WorldEngine)
7+
[![Hugging Face](https://img.shields.io/badge/Hugging_Face-Dataset-ffc107.svg?style=for-the-badge&logo=huggingface)](https://huggingface.co/datasets/OpenDriveLab/WorldEngine)
68
[![ModelScope](https://img.shields.io/badge/ModelScope-Dataset-orange.svg?style=for-the-badge)](https://www.modelscope.cn/datasets/OpenDriveLab/WorldEngine)
9+
<br>
710
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0.1-EE4C2C.svg?style=for-the-badge&logo=pytorch)](https://pytorch.org)
811
[![Python](https://img.shields.io/badge/python-3.9-blue?style=for-the-badge)](https://www.python.org)
912
[![License](https://img.shields.io/badge/License-Apache_2.0-green.svg?style=for-the-badge)](https://opensource.org/licenses/Apache-2.0)
@@ -20,7 +23,7 @@
2023
2124
## Table of Contents
2225

23-
- [**WorldEngine: Towards the Era of Post-Training for Physical AI**](#worldengine-towards-the-era-of-post-training-for-physical-ai)
26+
- [Towards the Era of Post-Training for Physical AI](#towards-the-era-of-post-training-for-physical-ai)
2427
- [Table of Contents](#table-of-contents)
2528
- [Highlights](#highlights)
2629
- [News](#news)
@@ -56,6 +59,7 @@
5659
## News
5760

5861
- **[2026/04/08]** Official code repository established. Data publication under preparation.
62+
- **[2026/04/09]** Official dataset released. See [OpenDriveLab/WorldEngine](https://huggingface.co/datasets/OpenDriveLab/WorldEngine) or [OpenDriveLab/WorldEngine (ModelScope)](https://www.modelscope.cn/datasets/OpenDriveLab/WorldEngine).
5963

6064

6165

docs/algengine_usage.md

Lines changed: 1 addition & 1 deletion
@@ -390,7 +390,7 @@ bash scripts/run_ray_distributed_testing.sh \
390390

391391
## Configuration
392392

393-
AlgEngine uses hierarchical configuration with MMDetection3D.
393+
AlgEngine uses hierarchical configuration with MMDetection3D. For a detailed reference of all config parameters, variants, and their relationships, see the [Configuration Guide](config_guide.md).
394394

395395
### Configuration Hierarchy
396396

docs/config_guide.md

Lines changed: 354 additions & 0 deletions
@@ -0,0 +1,354 @@
1+
# AlgEngine Configuration Guide
2+
3+
This guide provides a comprehensive reference for all configuration files under `projects/AlgEngine/configs/worldengine/`. Each config defines a complete training or evaluation experiment, covering model architecture, data pipeline, optimizer, and training schedule.
4+
5+
## Table of Contents
6+
7+
- [Configuration Overview](#configuration-overview)
8+
- [Base IL Training Configs](#base-il-training-configs)
9+
- [RL Fine-Tuning Configs (RLFT)](#rl-fine-tuning-configs-rlft)
10+
- [IL Fine-Tuning Config (ILFT)](#il-fine-tuning-config-ilft)
11+
- [Detection/Tracking Config](#detectiontracking-config)
12+
- [Common Parameters Reference](#common-parameters-reference)
13+
- [Config Comparison Table](#config-comparison-table)
14+
15+
---
16+
17+
## Configuration Overview
18+
19+
All configs inherit from `configs/_base_/default_runtime.py` and are organized into four categories:
20+
21+
```
22+
configs/worldengine/
23+
24+
├── Base IL Training (data scaling experiments)
25+
│ ├── e2e_vadv2_13pct.py
26+
│ ├── e2e_vadv2_25pct.py
27+
│ ├── e2e_vadv2_50pct.py ← reference config
28+
│ ├── e2e_vadv2_60pct.py
29+
│ ├── e2e_vadv2_70pct.py
30+
│ ├── e2e_vadv2_80pct.py
31+
│ ├── e2e_vadv2_90pct.py
32+
│ └── e2e_vadv2_100pct.py
33+
34+
├── RL Fine-Tuning (RLFT)
35+
│ ├── e2e_vadv2_50pct_rlft_common_log.py ← real logs, normal data only
36+
│ ├── e2e_vadv2_50pct_rlft_rare_log.py ← real logs, includes hard cases
37+
│ ├── e2e_vadv2_50pct_rlft_rare_rollout.py ← synthetic rollout (single source)
38+
│ ├── e2e_vadv2_50pct_rlft_rare_rollout_bwm.py ← synthetic rollout (multi-source)
39+
│ └── e2e_vadv2_50pct_rlft_rare_syn_replay.py ← synthetic replay, no real failures
40+
41+
├── IL Fine-Tuning (ILFT)
42+
│ └── e2e_vadv2_50pct_ilft_rare_log.py ← IL-only on rare cases
43+
44+
└── Detection/Tracking
45+
└── track_map_nuplan_r50_navtrain_50pct.py ← perception-only (UniAD)
46+
```
47+
48+
---
49+
50+
## Base IL Training Configs
51+
52+
**Files:** `e2e_vadv2_{13,25,50,60,70,80,90,100}pct.py`
53+
54+
These configs train the NAVFormer model end-to-end with Imitation Learning on different percentages of the NavSim training set. They are used for **data scaling experiments**.
55+
56+
### Key Characteristics
57+
58+
- **Model:** `NAVFormer` with `TrajScoringHead` (standard IL planning head)
59+
- **Backbone:** ResNet-50 (caffe style), frozen (`freeze_img_backbone=True`)
60+
- **Other modules:** img_neck, BN, BEV encoder are **NOT** frozen
61+
- **Pretrained weights:** `track_map_nuplan_r50_navtrain_50pct_bs1x8.pth`
62+
- **Dataset:** `NavSimOpenSceneE2E`
63+
- **Epochs:** 8, with evaluation at epoch 8
64+
65+
### Data Percentage Variants
66+
67+
The **only** difference between these configs is `nav_filter_path_train`:
68+
69+
| Config | Training Split |
70+
|--------|---------------|
71+
| `e2e_vadv2_13pct.py` | `navtrain_13pct.yaml` |
72+
| `e2e_vadv2_25pct.py` | `navtrain_25pct.yaml` |
73+
| `e2e_vadv2_50pct.py` | `navtrain_50pct.yaml` |
74+
| `e2e_vadv2_60pct.py` | `navtrain_60pct.yaml` |
75+
| `e2e_vadv2_70pct.py` | `navtrain_70pct.yaml` |
76+
| `e2e_vadv2_80pct.py` | `navtrain_80pct.yaml` |
77+
| `e2e_vadv2_90pct.py` | `navtrain_90pct.yaml` |
78+
| `e2e_vadv2_100pct.py` | `navtrain.yaml` (full set) |
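
The naming pattern in the table can be captured in a few lines. This is a minimal sketch, not code from the repository: `split_file_for` and `config_file_for` are hypothetical helpers that only reproduce the file names listed above.

```python
# Hypothetical helpers: they only mirror the naming pattern in the table above.
SUPPORTED_PERCENTAGES = (13, 25, 50, 60, 70, 80, 90, 100)


def split_file_for(percent: int) -> str:
    """Return the training split YAML name for a given data percentage."""
    if percent not in SUPPORTED_PERCENTAGES:
        raise ValueError(f"no config for {percent}% of the training set")
    # The 100% variant uses the full split, which has no percentage suffix.
    return "navtrain.yaml" if percent == 100 else f"navtrain_{percent}pct.yaml"


def config_file_for(percent: int) -> str:
    """Return the config file name for a given data percentage."""
    split_file_for(percent)  # reuse the validation above
    return f"e2e_vadv2_{percent}pct.py"


pairs = {config_file_for(p): split_file_for(p) for p in SUPPORTED_PERCENTAGES}
```

`pairs` then holds one entry per row of the table, which is a quick way to check that a new percentage variant follows the same convention.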
79+
80+
### Planning Head (IL)
81+
82+
```python
83+
planning_head=dict(
84+
type='TrajScoringHead', # standard IL head
85+
reward_shaping=False, # no reward shaping in IL
86+
num_poses=40,                       # poses per trajectory (the vocabulary itself comes from vocab_path)
87+
vocab_path="data/alg_engine/test_8192_kmeans.npy", # K-means trajectory clusters
88+
num_commands=4, # driving command types
89+
# No LoRA, no RL loss
90+
)
91+
```
92+
93+
---
94+
95+
## RL Fine-Tuning Configs (RLFT)
96+
97+
All RLFT configs fine-tune a **pretrained IL model** (`e2e_vadv2_50pct_ep8.pth`) using Reinforcement Learning, with LoRA adapters to keep fine-tuning parameter-efficient.
98+
99+
### Common RLFT Changes (vs Base IL)
100+
101+
| Parameter | Base IL | RLFT |
102+
|-----------|---------|------|
103+
| `planning_head.type` | `TrajScoringHead` | `TrajScoringHeadRL` |
104+
| `lora_finetuning` | N/A | `True` |
105+
| `freeze_img_backbone` | `True` | `True` |
106+
| `freeze_img_neck` | `False` | `True` |
107+
| `freeze_bn` | `False` | `True` |
108+
| `freeze_bev_encoder` | `False` | `True` |
109+
| `reward_shaping` | `False` | `True` |
110+
| `rl_finetuning` | N/A | `True` |
111+
| `importance_sampling` | N/A | `True` |
112+
| `orig_IL` | N/A | `True` (keep IL loss component) |
113+
| `load_from` | perception ckpt | IL e2e ckpt |
114+
| train pipeline keys | no `fail_mask` | includes `fail_mask` |
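
One way to read the table is as a set of overrides applied on top of the Base IL settings. The sketch below assumes a flat dictionary purely for illustration; the real configs nest these fields inside `model` and `planning_head`, and `to_rlft` is a hypothetical helper.

```python
import copy

# Values come from the table above; the flat layout is an assumption for illustration.
BASE_IL = {
    "planning_head_type": "TrajScoringHead",
    "freeze_img_backbone": True,
    "freeze_img_neck": False,
    "freeze_bn": False,
    "freeze_bev_encoder": False,
    "reward_shaping": False,
}

RLFT_OVERRIDES = {
    "planning_head_type": "TrajScoringHeadRL",
    "lora_finetuning": True,
    "freeze_img_neck": True,
    "freeze_bn": True,
    "freeze_bev_encoder": True,
    "reward_shaping": True,
    "rl_finetuning": True,
    "importance_sampling": True,
    "orig_IL": True,
}


def to_rlft(base: dict) -> dict:
    """Return a copy of the base settings with the RLFT overrides applied."""
    cfg = copy.deepcopy(base)
    cfg.update(RLFT_OVERRIDES)
    return cfg


rlft = to_rlft(BASE_IL)
# Keys whose value differs from Base IL, or that Base IL does not define at all.
changed = sorted(k for k in rlft if BASE_IL.get(k) != rlft[k])
```

`freeze_img_backbone` does not appear in `changed`, because it is `True` in both settings.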
115+
116+
### RL Loss Weights
117+
118+
All RLFT configs use the same loss configuration:
119+
120+
```python
121+
rl_loss_weight=dict(
122+
bce=0.0, # binary cross-entropy (disabled)
123+
rank=0.0, # ranking loss (disabled)
124+
PG=0.01, # policy gradient loss
125+
entropy=1.0 # entropy regularization
126+
)
127+
```
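
The weights are most naturally read as coefficients of a weighted sum. The sketch below assumes that reading; how `TrajScoringHeadRL` actually combines the terms (including the sign convention for the entropy term) is not stated in this guide, and the per-term loss values are made up for the example.

```python
rl_loss_weight = dict(bce=0.0, rank=0.0, PG=0.01, entropy=1.0)


def combine_losses(losses: dict, weights: dict) -> float:
    """Weighted sum of loss terms; a weight of 0.0 removes the term."""
    return sum(weights[name] * value for name, value in losses.items())


# Made-up per-term values, only to show which terms survive the weighting.
example_losses = dict(bce=0.7, rank=0.3, PG=2.0, entropy=0.5)
total = combine_losses(example_losses, rl_loss_weight)

# Terms that actually contribute to the total under these weights.
active_terms = [name for name, w in rl_loss_weight.items() if w != 0.0]
```

With these weights only `PG` and `entropy` contribute, which matches the "(disabled)" comments on `bce` and `rank`.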
128+
129+
### LoRA Configuration
130+
131+
LoRA adapters are applied in the planning head's transformer decoder:
132+
133+
```python
134+
use_lora=True, # in planning_head
135+
trans_use_lora=True, # in planning_head
136+
# In MotionTransformerAttentionLayer:
137+
use_lora=True, lora_rank=16
138+
# In MotionDeformableAttention:
139+
use_lora=True, lora_rank=16
140+
```
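
To give a sense of why rank 16 is parameter-efficient, the sketch below counts the trainable parameters a LoRA adapter adds to a single linear projection. It assumes a square 256 x 256 projection (256 is the `_dim_` value listed under Model Architecture) and ignores biases; the actual layer shapes in `MotionTransformerAttentionLayer` and `MotionDeformableAttention` may differ.

```python
def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters of the frozen dense weight matrix (bias ignored)."""
    return d_in * d_out


dim, rank = 256, 16  # dim is an assumption taken from `_dim_`; rank from `lora_rank`
extra = lora_extra_params(dim, dim, rank)
full = full_params(dim, dim)
ratio = extra / full
```

Under these assumptions the adapter trains 8192 parameters against 65536 frozen ones, that is, 12.5% of the projection.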
141+
142+
### RLFT Variant Comparison
143+
144+
| Config | Data Source | Dataset Type | `normal_only` | `hard_case_no_imi` | Special Fields |
145+
|--------|-----------|-------------|--------------|-------------------|----------------|
146+
| `rlft_common_log` | Real logs | `FineTune` | `True` | `False` (default) | - |
147+
| `rlft_rare_log` | Real logs | `FineTune` | `False` (default) | `True` | - |
148+
| `rlft_rare_rollout` | Synthetic | `FineTuneSynthetic` | - | - | `folder_name`, `customized_filter="v1"` |
149+
| `rlft_rare_rollout_bwm` | Synthetic (multi) | `FineTuneSynthetic` | - | - | `folder_name` (4 sources), `customized_filter="v1"` |
150+
| `rlft_rare_syn_replay` | Synthetic replay | `FineTuneSynthetic` | - | - | `customized_filter="v2"`, `include_real_failures=False` |
151+
152+
---
153+
154+
### rlft_common_log
155+
156+
**Purpose:** Ablation baseline -- RL fine-tuning using only **normal** (non-failure) data.
157+
158+
Key fields:
159+
- `normal_only=True` -- dataset only loads normal samples, excluding hard/failure cases
160+
- `hard_case_no_imi=False` (default) -- has no effect here, since there are no hard cases in the data
161+
162+
**Downstream effect:** In `navsim_openscene_finetuning.py`, when `normal_only=True`, the dataset's `index_map` only contains normal samples. No failure cases from `finetune_yaml` enter the training loop.
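
The filtering described above can be sketched as follows. This is an illustration only, not the code in `navsim_openscene_finetuning.py`: `build_index_map`, the `is_failure` flag, and the sample layout are all hypothetical.

```python
def build_index_map(samples: list, normal_only: bool = False) -> list:
    """Return the indices of samples that enter training.

    With normal_only=True, samples flagged as failures are skipped entirely.
    """
    return [
        i for i, sample in enumerate(samples)
        if not (normal_only and sample["is_failure"])
    ]


# Hypothetical sample records; real entries carry far more fields.
samples = [
    {"token": "a", "is_failure": False},
    {"token": "b", "is_failure": True},
    {"token": "c", "is_failure": False},
]

normal_index_map = build_index_map(samples, normal_only=True)
mixed_index_map = build_index_map(samples, normal_only=False)
```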
163+
164+
---
165+
166+
### rlft_rare_log
167+
168+
**Purpose:** Full RLFT on rare/hard failure scenarios from **real driving logs**.
169+
170+
Key fields:
171+
- `normal_only=False` (default) -- dataset mixes normal + failure samples
172+
- `hard_case_no_imi=True` -- for hard cases (`fail_mask != 0`), imitation learning loss is zeroed out; only RL losses (PG + entropy) are applied
173+
174+
**Downstream effect:** In `traj_scoring_head_RL.py`, when `hard_case_no_imi=True`, the imitation mask is set to 0 for all samples where `fail_mask != 0` (both real failures and synthetic cases). This forces the model to learn from RL rewards rather than imitating expert behavior on difficult scenarios.
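
The masking rule can be written out in a few lines. This is a minimal sketch of the behavior described above, not the code in `traj_scoring_head_RL.py`; `imitation_mask` and `masked_il_loss` are hypothetical names and plain lists stand in for tensors.

```python
def imitation_mask(fail_mask: list, hard_case_no_imi: bool) -> list:
    """Return 1.0 where the IL loss is kept and 0.0 where it is zeroed out."""
    if not hard_case_no_imi:
        return [1.0] * len(fail_mask)
    # Any non-zero fail_mask entry marks a hard case, whatever its value.
    return [0.0 if flag != 0 else 1.0 for flag in fail_mask]


def masked_il_loss(per_sample_loss: list, fail_mask: list, hard_case_no_imi: bool) -> list:
    """Apply the imitation mask to per-sample IL losses."""
    mask = imitation_mask(fail_mask, hard_case_no_imi)
    return [loss * m for loss, m in zip(per_sample_loss, mask)]


fail_mask = [0, 1, 0, 2]  # made-up batch: samples 1 and 3 are hard cases
il_loss = [0.4, 0.9, 0.2, 1.5]
kept = masked_il_loss(il_loss, fail_mask, hard_case_no_imi=True)
```

Hard cases end up with zero IL loss, so only the RL terms drive their gradients; normal samples keep their IL loss unchanged.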
175+
176+
---
177+
178+
### rlft_rare_rollout
179+
180+
**Purpose:** RLFT using **synthetic rollout** trajectories generated from a single source.
181+
182+
Key fields:
183+
- `train_dataset_type = "NavSimOpenSceneE2EFineTuneSynthetic"` -- loads synthetic trajectory data
184+
- `synthetic_folder_names = ["e2e_vadv2_50pct_navtrain_50pct_collision_NR_250911"]` -- single synthetic source
185+
- `customized_filter="v1"` -- filtering strategy for synthetic data
186+
- `folder_name=synthetic_folder_names` -- passed to dataset for loading
187+
188+
---
189+
190+
### rlft_rare_rollout_bwm
191+
192+
**Purpose:** Extended version of `rare_rollout` with **multiple augmented synthetic sources** (backward-masked trajectories).
193+
194+
Key fields:
195+
- `synthetic_folder_names` -- 4 sources covering collision, ego progress, and off-road scenarios:
196+
```python
197+
synthetic_folder_names = [
198+
"e2e_vadv2_50pct_navtrain_50pct_collision_NR_250911",
199+
"e2e_vadv2_50pct_aug_navtrain_50pct_collision_NR_250928",
200+
"e2e_vadv2_50pct_aug_navtrain_50pct_ep_1pct_NR_250928",
201+
"e2e_vadv2_50pct_aug_navtrain_50pct_offroad_NR_250928",
202+
]
203+
```
204+
  Note: you need to produce your own rollouts and place them under `data/alg_engine/openscene-synthetic`. See [SimEngine Usage Guide - Rollout Scripts](simengine_usage.md#rollout-scripts) for how to generate augmented rollouts.
205+
- `customized_filter="v1"` -- same filtering as `rare_rollout`
206+
207+
---
208+
209+
### rlft_rare_syn_replay
210+
211+
**Purpose:** RLFT with synthetic replay data, **excluding real failure cases**.
212+
213+
Key fields:
214+
- `customized_filter="v2"` -- different filtering strategy from v1
215+
- `include_real_failures=False` -- explicitly excludes real failure data, training only on synthetic replays
216+
- Defines `synthetic_folder_names` the same way as `rare_rollout_bwm`, but with a single source instead of four
217+
218+
---
219+
220+
## IL Fine-Tuning Config (ILFT)
221+
222+
**File:** `e2e_vadv2_50pct_ilft_rare_log.py`
223+
224+
**Purpose:** Fine-tune with **Imitation Learning only** (no RL) on rare failure cases. Serves as an ablation to compare against RLFT approaches.
225+
226+
### Key Differences from RLFT
227+
228+
| Parameter | RLFT | ILFT |
229+
|-----------|------|------|
230+
| `rl_finetuning` | `True` | `False` |
231+
| `reward_shaping` | `True` | `False` |
232+
| `rl_loss_weight` | `dict(bce=0, rank=0, PG=0.01, entropy=1.0)` | N/A |
233+
| `orig_IL` | `True` | N/A |
234+
| `evaluation.interval` | 8 | 1 (every epoch) |
235+
236+
The model still uses `TrajScoringHeadRL` as the head type (for code compatibility) and LoRA adapters, but all RL-specific losses are disabled. The model learns purely from imitation.
237+
238+
---
239+
240+
## Detection/Tracking Config
241+
242+
**File:** `track_map_nuplan_r50_navtrain_50pct.py`
243+
244+
**Purpose:** Train perception-only model for **object detection + map segmentation** (no planning).
245+
246+
### Key Differences from E2E Configs
247+
248+
| Parameter | E2E (NAVFormer) | Detection (UniAD) |
249+
|-----------|----------------|-------------------|
250+
| `model.type` | `NAVFormer` | `UniAD` |
251+
| `dataset_type` | `NavSimOpenSceneE2E` | `NavSimOpenSceneE2EDet` |
252+
| `queue_length` | 4 | 3 |
253+
| `total_epochs` | 8 | 40 |
254+
| `samples_per_gpu` | 2 | 1 |
255+
| `freeze_*` | varies | all `False` (full training) |
256+
| `planning_head` | yes | no |
257+
| `seg_head` | no | yes (`PansegformerHead`) |
258+
| `eval_mod` | `[]` | `['det', 'map']` |
259+
| `load_from` | varies | `bevformerv2-r50-t1-base_epoch_48.pth` |
260+
261+
Additional features in detection config:
262+
- **Segmentation head** (`PansegformerHead`): lane detection and map segmentation
263+
- **3D annotations**: `gt_bboxes_3d`, `gt_labels_3d`, `gt_lane_labels`, `gt_lane_bboxes`, `gt_lane_masks`
264+
- **Image scaling**: `RandomScaleImageMultiViewImage` with scale 0.5
265+
- **Loading**: `LoadMultiViewImageFromFilesInCeph` (vs `LoadMultiViewImageFromFilesWithDownsample` in E2E)
266+
267+
---
268+
269+
## Common Parameters Reference
270+
271+
### Spatial & BEV Settings
272+
273+
| Parameter | Value | Description |
274+
|-----------|-------|-------------|
275+
| `point_cloud_range` | `[-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]` | 3D detection range (meters) |
276+
| `voxel_size` | `[0.2, 0.2, 8]` | Voxel grid resolution |
277+
| `bev_h_, bev_w_` | `200, 200` | BEV feature map size |
278+
| `patch_size` | `[102.4, 102.4]` | Spatial patch size for BEV |
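
These values are mutually consistent, which a short calculation makes visible. The sketch below only does arithmetic on the numbers in the table; the derived names (`x_extent`, `cell_x`, and so on) are introduced here for illustration.

```python
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]  # x_min, y_min, z_min, x_max, y_max, z_max
bev_h_, bev_w_ = 200, 200

# Metric extent covered along each axis.
x_extent = point_cloud_range[3] - point_cloud_range[0]
y_extent = point_cloud_range[4] - point_cloud_range[1]
z_extent = point_cloud_range[5] - point_cloud_range[2]

# Ground area covered by a single BEV cell, in meters.
cell_x = x_extent / bev_w_
cell_y = y_extent / bev_h_
```

The x/y extent is 102.4 m, which equals `patch_size`; the z extent is 8 m, which equals the z component of `voxel_size`; and each BEV cell covers about 0.512 m x 0.512 m.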
279+
280+
### Temporal Settings
281+
282+
| Parameter | Value | Description |
283+
|-----------|-------|-------------|
284+
| `queue_length` | 4 (E2E) / 3 (Det) | Number of frames per sequence |
285+
| `past_steps` | 3 | Historical tracking steps |
286+
| `fut_steps` | 4 | Future prediction steps |
287+
| `planning_steps` | 8 | Planning horizon steps |
288+
289+
### Model Architecture
290+
291+
| Parameter | Value | Description |
292+
|-----------|-------|-------------|
293+
| `_dim_` | 256 | Embedding dimension |
294+
| `_ffn_dim_` | 512 | Feed-forward network dimension |
295+
| `_num_levels_` | 4 | Multi-scale feature levels |
296+
| `num_query` | 900 | Number of detection queries |
297+
| `num_cams` | 8 | Number of camera views |
298+
299+
### Tracking (QIM & Memory Bank)
300+
301+
| Parameter | Value | Description |
302+
|-----------|-------|-------------|
303+
| `qim_type` | `QIMBase` | Query interaction module type |
304+
| `fp_ratio` | 0.3 | False positive ratio |
305+
| `random_drop` | 0.1 | Random query drop rate |
306+
| `memory_bank_len` | 4 | Frames to keep in memory bank |
307+
308+
### Optimizer & Schedule
309+
310+
| Parameter | Value | Description |
311+
|-----------|-------|-------------|
312+
| `optimizer.type` | `AdamW` | Optimizer type |
313+
| `optimizer.lr` | 2e-4 | Base learning rate |
314+
| `img_backbone lr_mult` | 0.1 | Backbone learning rate multiplier |
315+
| `weight_decay` | 0.01 | Weight decay |
316+
| `lr_config.policy` | `CosineAnnealing` | LR schedule |
317+
| `warmup_iters` | 500 | Linear warmup iterations |
318+
| `grad_clip.max_norm` | 35 | Gradient clipping threshold |
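
In an MMCV 1.x style config these settings would typically be written as below. This is a sketch under that assumption: the nesting (`paramwise_cfg`, `optimizer_config`, `warmup="linear"`) follows the usual MMCV layout rather than being copied from the WorldEngine configs, and only values from the table are filled in.

```python
optimizer = dict(
    type="AdamW",
    lr=2e-4,
    weight_decay=0.01,
    # Per-module learning rate scaling: parameters under img_backbone use lr * lr_mult.
    paramwise_cfg=dict(custom_keys={"img_backbone": dict(lr_mult=0.1)}),
)
optimizer_config = dict(grad_clip=dict(max_norm=35))
lr_config = dict(
    policy="CosineAnnealing",
    warmup="linear",
    warmup_iters=500,
    # A real MMCV config also needs min_lr or min_lr_ratio here;
    # this guide does not list the value, so it is left out.
)

# Effective learning rate for the image backbone when it is not frozen.
backbone_mult = optimizer["paramwise_cfg"]["custom_keys"]["img_backbone"]["lr_mult"]
backbone_lr = optimizer["lr"] * backbone_mult
```

The backbone multiplier only has an effect where the backbone is trainable, for example in the detection config; in the E2E configs the backbone is frozen, so it receives no updates regardless of `lr_mult`.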
319+
320+
### Freeze Strategy
321+
322+
| Parameter | Base IL | RLFT/ILFT | Detection |
323+
|-----------|---------|-----------|-----------|
324+
| `freeze_img_backbone` | `True` | `True` | `False` |
325+
| `freeze_img_neck` | `False` | `True` | `False` |
326+
| `freeze_bn` | `False` | `True` | `False` |
327+
| `freeze_bev_encoder` | `False` | `True` | `False` |
328+
329+
---
330+
331+
## Config Comparison Table
332+
333+
### All E2E Configs at a Glance
334+
335+
| Config | Model | Head | Dataset | Freeze | LoRA | RL | Epochs | Eval Interval | Pretrained |
336+
|--------|-------|------|---------|--------|------|----|--------|--------------|------------|
337+
| `e2e_vadv2_Xpct` | NAVFormer | TrajScoringHead | NavSimOpenSceneE2E | backbone only | No | No | 8 | 8 | perception ckpt |
338+
| `rlft_common_log` | NAVFormer | TrajScoringHeadRL | FineTune | all except planning | Yes | Yes | 8 | 8 | IL e2e ckpt |
339+
| `rlft_rare_log` | NAVFormer | TrajScoringHeadRL | FineTune | all except planning | Yes | Yes | 8 | 8 | IL e2e ckpt |
340+
| `rlft_rare_rollout` | NAVFormer | TrajScoringHeadRL | FineTuneSynthetic | all except planning | Yes | Yes | 8 | 8 | IL e2e ckpt |
341+
| `rlft_rare_rollout_bwm` | NAVFormer | TrajScoringHeadRL | FineTuneSynthetic | all except planning | Yes | Yes | 8 | 8 | IL e2e ckpt |
342+
| `rlft_rare_syn_replay` | NAVFormer | TrajScoringHeadRL | FineTuneSynthetic | all except planning | Yes | Yes | 8 | 8 | IL e2e ckpt |
343+
| `ilft_rare_log` | NAVFormer | TrajScoringHeadRL | FineTune | all except planning | Yes | No | 8 | 1 | IL e2e ckpt |
344+
| `track_map` | UniAD | N/A (det only) | NavSimOpenSceneE2EDet | none | No | No | 40 | 40 | BEVFormerV2 ckpt |
345+
346+
### Failure Data Sources
347+
348+
The `finetune_yaml` files define which failure scenarios are included:
349+
350+
| YAML File | Failure Type |
351+
|-----------|-------------|
352+
| `navtrain_50pct_collision.yaml` | Collision scenarios |
353+
| `navtrain_50pct_ep_1pct.yaml` | Bottom 1% ego progress (near-stationary) |
354+
| `navtrain_50pct_off_road.yaml` | Off-road / drivable area violations |
