Commit 7c0026b

[DOCS] Add figure sources
1 parent 70b7ebb commit 7c0026b

2 files changed: +12 -2 lines changed

robotics-ai-suite/docs/embodied/sample_pipelines/pi05_with_rtc.rst

Lines changed: 4 additions & 0 deletions
@@ -17,6 +17,8 @@ Key structural features of π₀.₅ include:
 :width: 85%
 :align: center

+*(Figure source:* `Paper <https://arxiv.org/abs/2504.16054>`_ *π₀.₅: a Vision-Language-Action Model with Open-World Generalization)*
+
 Real-Time Chunking (RTC) is an inference strategy designed to enable high-frequency robotic control with high-latency flow-matching policies (e.g., Pi0, Pi0.5). Based on the application of asynchronous inference execution, RTC employs a unique **Prefix Guidance** mechanism during inference. Instead of blending overlapping chunks after generation (temporal ensembling), RTC uses the unexecuted portion of the previous chunk as a constraint during the flow-matching process. By treating the transition as an inpainting problem, the model is guided to generate new trajectories that seamlessly extend the current motion, ensuring continuous control.

 The synergy between Pi0.5 and RTC enables sophisticated generalist control on standard hardware by addressing two critical problems of standard VLA models: **Action Waiting** and **Action Jumping**.
@@ -28,6 +30,8 @@ The synergy between Pi0.5 and RTC enables sophisticated generalist control on st
 :width: 85%
 :align: center

+*(Figure source:* `Paper <https://arxiv.org/abs/2506.07339>`_ *Real-Time Execution of Action Chunking Flow Policies)*
+
 This project demonstrates an implementation of Pi0.5 + RTC using the OpenVINO toolkit, specifically accelerating inference on Intel platforms. It provides a comprehensive end-to-end pipeline, covering both MuJoCo simulation for policy validation and a modular workflow for deployment on real ALOHA robots.

 Installation

robotics-ai-suite/pipelines/pi05-rtc-ov/README.md

Lines changed: 8 additions & 2 deletions
@@ -8,15 +8,21 @@ Key structural features of π₀.₅ include:
 * **Discretized State Tokenization**: Robot proprioceptive state is discretized and treated as text tokens within the input prefix, allowing the model to "read" its physical state using the same attention mechanisms as natural language.
 * **Unified Prefix Processing**: Visual patch tokens from SigLIP and text tokens are concatenated into a single sequence, which the transformer processes holistically before passing context to the Action Expert.

-![Pi0.5 Overview](README.assets/pi05-overview.png)
+<p align="center">
+<img src="README.assets/pi05-overview.png" alt="Pi0.5 Overview"><br>
+<em>Figure source: <a href="https://arxiv.org/abs/2504.16054">Paper</a> π₀.₅: a Vision-Language-Action Model with Open-World Generalization</em>
+</p>

 Real-Time Chunking (RTC) is an inference strategy designed to enable high-frequency robotic control with high-latency flow-matching policies (e.g., Pi0, Pi0.5). Based on the application of asynchronous inference execution, RTC employs a unique **Prefix Guidance** mechanism during inference. Instead of blending overlapping chunks after generation (temporal ensembling), RTC uses the unexecuted portion of the previous chunk as a constraint during the flow-matching process. By treating the transition as an inpainting problem, the model is guided to generate new trajectories that seamlessly extend the current motion, ensuring continuous control.

 The synergy between Pi0.5 and RTC enables sophisticated generalist control on standard hardware by addressing two critical problems of standard VLA models: **Action Waiting** and **Action Jumping**.
 1. **Eliminating Action Waiting**: RTC runs inference asynchronously in the background while the robot executes buffered actions. This ensures the robot never pauses to "think," maintaining high-frequency control (e.g., 50Hz) despite the model's lower inference speed.
 2. **Preventing Action Jumping**: Through **Prefix Guidance**, RTC treats trajectory generation as an inpainting task. It constrains the start of the new plan to align perfectly with the unexecuted tail of the previous plan, enforcing continuity at the generation level rather than relying on post-hoc smoothing.

-![RTC Overview](README.assets/RTC-overview.png)
+<p align="center">
+<img src="README.assets/RTC-overview.png" alt="RTC Overview"><br>
+<em>Figure source: <a href="https://arxiv.org/abs/2506.07339">Paper</a> Real-Time Execution of Action Chunking Flow Policies</em>
+</p>

 This project demonstrates an implementation of Pi0.5 + RTC using the OpenVINO toolkit, specifically accelerating inference on Intel platforms. It provides a comprehensive end-to-end pipeline, covering both MuJoCo simulation for policy validation and a modular workflow for deployment on real ALOHA robots.
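
Reviewer note: the README text in this diff describes the robot executing buffered actions while inference runs, and the unexecuted portion of the previous chunk serving as a constraint for the next one. The index bookkeeping behind that description can be sketched as below. This is a minimal illustration, not code from this repository: the function names `inference_delay_ticks` and `split_unexecuted`, the 50-step chunk, and the 0.25 s latency are all hypothetical, and the sketch does not show how the pipeline applies the constraint during flow matching.

```python
import math


def inference_delay_ticks(latency_s: float, control_hz: float) -> int:
    """Control ticks that elapse while one inference call is running."""
    return math.ceil(latency_s * control_hz)


def split_unexecuted(prev_chunk, executed, delay):
    """Split the unexecuted part of the previous chunk (hypothetical helper).

    Returns (committed, tail):
      committed - actions the robot consumes while inference is running
      tail      - actions still unexecuted when the new chunk arrives
    """
    remaining = prev_chunk[executed:]
    if delay > len(remaining):
        # The buffer would run dry before the new chunk is ready,
        # which is the "Action Waiting" situation described in the README.
        raise ValueError("not enough buffered actions to cover inference latency")
    return remaining[:delay], remaining[delay:]


# Assumed numbers for illustration only: 50-step chunk, 50 Hz control loop,
# 0.25 s inference latency, inference started after 20 executed actions.
prev_chunk = list(range(50))  # integers stand in for 50 action vectors
delay = inference_delay_ticks(0.25, 50.0)
committed, tail = split_unexecuted(prev_chunk, executed=20, delay=delay)
print(delay, committed[0], committed[-1], tail[0], tail[-1])
```

Taken together, `committed` and `tail` make up the unexecuted portion of the previous chunk. The `ValueError` branch marks the case where the control loop would have to pause, so a caller would need to start inference early enough that `delay` never exceeds the number of buffered actions.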
