Physical-Intelligence/real-time-chunking-kinetix

Simulated experiments for the papers Real-Time Execution of Action Chunking Flow Policies and Training-Time Action Conditioning for Efficient Real-Time Chunking.

Installation

# Clone Kinetix submodule
git submodule update --init
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync

Pre-trained checkpoints and data

gs://rtc-assets/expert/ contains expert checkpoints generated by src/train_expert.py, and gs://rtc-assets/expert/data/ contains a one-million-transition dataset for each level (generated by src/generate_data.py). Be aware that the expert/ directory is about 60 GiB in total.

gs://rtc-assets/bc/ contains imitation learning policies for each level, trained by src/train_flow.py on the data above. These are directly usable with src/eval_flow.py.
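One way to fetch these assets is with the `gsutil` CLI. The sketch below only builds and prints the command. It assumes that `gsutil` is installed and that the bucket is readable from your account, neither of which is stated above. `gsutil_copy_command` is a hypothetical helper, not part of this repository.

```python
import shlex


def gsutil_copy_command(remote: str, local_dir: str) -> list[str]:
    """Build a recursive, parallel `gsutil cp` command for a bucket prefix."""
    # -m enables parallel transfers; -r copies the prefix recursively.
    return ["gsutil", "-m", "cp", "-r", remote, local_dir]


# Copy the (smaller) bc/ prefix into the current directory, creating ./bc.
cmd = gsutil_copy_command("gs://rtc-assets/bc", ".")
print(shlex.join(cmd))  # gsutil -m cp -r gs://rtc-assets/bc .
# To actually download (assumption: gsutil is on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Given the size of expert/, it is probably worth downloading only the prefixes you need.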

Reproduce results

Note that, for all scripts, the number of GPUs must evenly divide the number of levels (default 12), because computation is sharded over levels.
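The divisibility constraint can be checked before launching a job. This is a minimal sketch assuming the default of 12 levels. `valid_gpu_counts` is a hypothetical helper, not part of this repository.

```python
def valid_gpu_counts(num_levels: int = 12) -> list[int]:
    """Return every GPU count that evenly divides the number of levels."""
    return [n for n in range(1, num_levels + 1) if num_levels % n == 0]


print(valid_gpu_counts())  # [1, 2, 3, 4, 6, 12]
```

With the default 12 levels, an 8-GPU machine would not satisfy the constraint. One option there is to expose only 4 or 6 devices, for example through `CUDA_VISIBLE_DEVICES`.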

1. Train expert policies: uv run src/train_expert.py
   - By default, this will train 8 seeds per level for 65 million environment steps each.
   - Checkpoints, videos, and stats are written to a wandb project called rtc-kinetix-expert and the local directory ./logs-expert/<wandb-run-name>. It is recommended to control other wandb options, like the run name, using environment variables.
2. Generate data: uv run src/generate_data.py --config.run-path ./logs-expert/<wandb-run-name>
   - For each level, this will automatically load the best-performing checkpoint for each seed (discarding seeds that didn't reach a certain success threshold).
   - By default, 1 million environment steps are collected for each level using a mixture of expert policies.
   - Data is written back to ./logs-expert/<wandb-run-name>/data/.
3. Train imitation learning policies: uv run src/train_flow.py --config.run-path ./logs-expert/<wandb-run-name>
   - This will load the data from step 2 and train flow matching policies for each level.
   - Checkpoints, videos, and stats are written to a wandb project called rtc-kinetix-bc and the local directory ./logs-bc/<wandb-run-name>. It is recommended to control other wandb options, like the run name, using environment variables.
4. Evaluate imitation learning policies: uv run src/eval_flow.py --config.run-path ./logs-bc/<wandb-run-name> --output-dir <output-dir>
   - This will load the checkpoints from step 3 and evaluate them for 2048 trials per level by default.
   - Currently, the script performs an exhaustive sweep over inference delay and execution horizon for all methods.
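The four commands above chain together through the run directories. The sketch below assembles them in order and shows one way to set the wandb run name through the `WANDB_NAME` environment variable. It assumes the local log directory is named after the wandb run name, as the paths above suggest. `pipeline_commands`, `run_env`, and the example run names are hypothetical and not part of this repository.

```python
import os
import shlex


def pipeline_commands(expert_run: str, bc_run: str, output_dir: str) -> list[list[str]]:
    """Return the four reproduction commands in order, as argument lists."""
    expert_path = f"./logs-expert/{expert_run}"
    bc_path = f"./logs-bc/{bc_run}"
    return [
        ["uv", "run", "src/train_expert.py"],
        ["uv", "run", "src/generate_data.py", "--config.run-path", expert_path],
        ["uv", "run", "src/train_flow.py", "--config.run-path", expert_path],
        ["uv", "run", "src/eval_flow.py", "--config.run-path", bc_path,
         "--output-dir", output_dir],
    ]


def run_env(run_name: str) -> dict[str, str]:
    """Copy the current environment and set the wandb run name via WANDB_NAME."""
    return {**os.environ, "WANDB_NAME": run_name}


# Print the commands; placeholder run names stand in for <wandb-run-name>.
for cmd in pipeline_commands("expert-run", "bc-run", "./eval-out"):
    print(shlex.join(cmd))
# To execute a step (assumption: uv is on PATH):
#   import subprocess; subprocess.run(cmd, env=run_env("expert-run"), check=True)
```

Steps 1 and 3 each create their own wandb run, so they would take different run names: the expert run name for step 1 and the bc run name for step 3.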

Training-Time RTC

To reproduce the results for training-time RTC, follow these steps:

1. Change simulated_delay in the model config to 5.
2. Fine-tune the pre-trained checkpoint with simulated delay for 8 epochs: uv run src/train_flow.py --config.run-path <run_path> --config.load-dir bc/24 --config.num-epochs 8, where bc is the contents of gs://rtc-assets/bc/.
3. Evaluate with src/eval_flow.py as described above.

