This repository was archived by the owner on Nov 12, 2025. It is now read-only.

Commit 3670ed6

Authored by HaomingSong and Copilot
Real world franka example code (#11)
* finish real world franka example code
* format code
* format code
* Update experiments/7_franka/eval_franka.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: HaomingSong <haomingsong@sjtu.edu.cn>
1 parent 3a8c69b commit 3670ed6

8 files changed: 498 additions & 4 deletions


.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -233,6 +233,7 @@ tools/caption_video.html
 experiments/2_libero/logs
 experiments/2_libero/logs/videos
 demo_data/demos25
+**/videos/**
 demo_data/libero_spatial_no_noops_1.0.0_lerobot
 experiments/test
```

.gitmodules

Lines changed: 3 additions & 0 deletions

```diff
@@ -0,0 +1,3 @@
+[submodule "experiments/7_franka/deoxys_control"]
+	path = experiments/7_franka/deoxys_control
+	url = https://github.com/UT-Austin-RPL/deoxys_control.git
```

experiments/7_franka/README.md

Lines changed: 96 additions & 0 deletions

# Franka Emika Panda Robot Control with EO-1

This directory contains the implementation for controlling Franka Emika Panda robots with the EO-1 model. The system enables real-time robot manipulation through vision-language-action integration.

## 🚀 Quick Start

### Prerequisites

**Hardware Requirements:**

- Franka Emika Panda robot arm
- RealSense cameras (or compatible RGB cameras)
- **NUC**: configured with a real-time kernel for robot control
- **Workstation**: equipped with a GPU for model inference

**Software Requirements:**

- Ubuntu 20.04+ with CUDA support
- Python 3.10+
- Real-time kernel configured on the NUC
- Deoxys control system properly configured

### Installation

1. **Set up submodules:**

   ```bash
   git submodule update --init --recursive experiments/7_franka/deoxys_control
   ```

2. **Configure the robot control system:**
   Follow the [Deoxys documentation](https://zhuyifengzju.github.io/deoxys_docs/html/index.html) to configure your NUC and workstation for Franka control.

3. **Install dependencies on the workstation:**

   ```bash
   # Create conda environment
   conda create -n eo python=3.10
   conda activate eo

   # Install deoxys for the workstation
   pip install -e experiments/7_franka/deoxys_control/deoxys

   # Install additional requirements
   pip install -r experiments/7_franka/requirements.txt
   ```

**Note**: The NUC handles real-time robot control while the workstation runs EO-1 model inference. Both machines must be configured according to the Deoxys documentation.
## 🤖 Running Robot Control

### Basic Usage

```bash
python experiments/7_franka/eval_franka.py \
    --model-path "path/to/your/model" \
    --repo-id libero_spatial_no_noops_1.0.0_lerobot \
    --task "Pick and place a cube" \
    --video-out-path experiments/7_franka/videos \
    --max-timesteps 300
```

### Parameters

| Parameter          | Description                                       | Default                                 |
| ------------------ | ------------------------------------------------- | --------------------------------------- |
| `--model-path`     | Path to the trained EO-1 model checkpoint         | Required                                |
| `--repo-id`        | Dataset repository ID for task specification      | `libero_spatial_no_noops_1.0.0_lerobot` |
| `--task`           | Natural-language description of the task          | `"Pick and place a cube"`               |
| `--video-out-path` | Directory to save recorded videos                 | `experiments/7_franka/videos`           |
| `--max-timesteps`  | Maximum number of control steps                   | `300`                                   |
| `--resize-size`    | Image resize dimensions for model input           | `224`                                   |
| `--replan-steps`   | Receding-horizon control (RHC) replanning horizon | `5`                                     |
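The `--replan-steps` parameter implements receding-horizon control: the policy predicts a chunk of actions, the robot executes only the first `replan_steps` of them, and the model is queried again once the queue empties. A minimal sketch of that loop, with a hypothetical `predict_chunk` stub standing in for the model call:

```python
import collections

def predict_chunk():
    """Stub policy: returns a chunk of 10 'actions' (plain ints for illustration)."""
    return list(range(10))

REPLAN_STEPS = 5
action_plan = collections.deque()
executed = []

for _ in range(12):  # control steps
    if not action_plan:
        chunk = predict_chunk()                   # query the policy only when the queue is empty
        assert len(chunk) >= REPLAN_STEPS
        action_plan.extend(chunk[:REPLAN_STEPS])  # keep only the first REPLAN_STEPS actions
    executed.append(action_plan.popleft())        # execute one action per control step

print(len(executed))  # -> 12 steps executed, with a model query every 5 steps
```

Smaller horizons react faster to new observations at the cost of more frequent (and slower) inference calls.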
### Camera Configuration

The system supports multiple camera inputs. Update the camera serial numbers in `eval_franka.py`:

```python
# Camera serial numbers (update these with your actual camera IDs)
EGO_CAMERA = "213522070137"  # Wrist camera
THIRD_CAMERA = "243222074139"  # External camera
```
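One detail worth knowing when wiring in your own cameras: OpenCV delivers frames in BGR channel order, while the model input (and most image tooling) expects RGB, so `eval_franka.py` converts with `cv2.cvtColor` before resizing. Channel reversal is its own inverse, so converting twice silently returns BGR. A pure-NumPy sketch of the same operation (a stand-in for `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)`):

```python
import numpy as np

def bgr_to_rgb(img: np.ndarray) -> np.ndarray:
    """Reverse the channel axis: BGR <-> RGB (equivalent to cv2.COLOR_BGR2RGB)."""
    return img[..., ::-1]

bgr = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)
rgb = bgr_to_rgb(bgr)

# Per-pixel channel order is reversed ...
assert rgb[0, 0].tolist() == bgr[0, 0].tolist()[::-1]
# ... and applying the conversion twice is a no-op.
assert np.array_equal(bgr_to_rgb(rgb), bgr)
print("ok")
```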
## 🔒 Safety Considerations

- Always ensure proper workspace setup before operation
- Monitor robot movements and be ready to use the emergency stop
- Verify camera positioning for optimal visual coverage

## 📝 Notes

- The system requires both wrist and external cameras for optimal performance
- Model performance depends on lighting conditions and camera positioning
- Regular calibration of the robot and cameras is recommended
- Check the video output directory for recorded demonstrations
Submodule deoxys_control added at 97396fd
experiments/7_franka/eval_franka.py

Lines changed: 196 additions & 0 deletions

```python
import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"

import collections
import copy
import dataclasses
import os.path as osp
import time
from datetime import datetime
from pathlib import Path

import cv2
import deoxys.utils.transform_utils as dft
import imageio
import numpy as np
import torch
import tqdm
import tyro
from deoxys import config_root
from deoxys.experimental.motion_utils import reset_joints_to
from deoxys.franka_interface import FrankaInterface
from deoxys.utils import YamlConfig
from PIL import Image
from realsense_camera import MultiCamera
from transformers import AutoModel, AutoProcessor

# Add your camera serial numbers here
EGO_CAMERA = "213522070137"
THIRD_CAMERA = "243222074139"

reset_joint_positions = [
    0.0760389047913384,
    -1.0362613022620384,
    -0.054254247684777324,
    -2.383951857286591,
    -0.004505598470154735,
    1.3820559157131187,
    0.784935455988679,
]


def save_rollout_video(rollout_images, save_dir):
    """Saves an MP4 replay of an episode."""
    date_time = time.strftime("%Y_%m_%d-%H_%M_%S")
    mp4_path = Path(save_dir) / f"{date_time}.mp4"
    video_writer = imageio.get_writer(mp4_path, fps=5)
    for img in rollout_images:
        video_writer.append_data(img)
    video_writer.close()
    print(f"Saved rollout MP4 at path {mp4_path}")


@dataclasses.dataclass
class Args:
    # Model parameters
    resize_size: int = 224
    replan_steps: int = 5
    model_path: str = ""
    repo_id: str = ""
    task: str = ""

    # Utils
    video_out_path: Path = Path("experiments/7_franka/videos")  # Path to save videos
    max_timesteps: int = 300  # Number of timesteps to run


def convert_gripper_action(action):
    """Flip the predicted gripper channel (1 - x); values below 0.5 are
    snapped to -1 to match the controller's gripper convention."""
    action[-1] = 1 - action[-1]
    if action[-1] < 0.5:
        action[-1] = -1
    return action


def get_robot_interface():
    robot_interface = FrankaInterface(osp.join(config_root, "charmander.yml"))
    controller_cfg = YamlConfig(osp.join(config_root, "osc-pose-controller.yml")).as_easydict()
    controller_type = "OSC_POSE"

    return robot_interface, controller_cfg, controller_type


def main(args: Args):
    multi_camera = MultiCamera()
    robot_interface, controller_cfg, controller_type = get_robot_interface()

    model = (
        AutoModel.from_pretrained(args.model_path, dtype=torch.bfloat16, trust_remote_code=True).eval().cuda()
    )
    processor = AutoProcessor.from_pretrained(args.model_path, trust_remote_code=True)

    while True:
        action_plan = collections.deque()
        input("Press Enter to start episode ...")
        reset_joints_to(robot_interface, reset_joint_positions)

        replay_images = []
        bar = tqdm.tqdm(
            range(args.max_timesteps),
            position=0,
            leave=True,
            ncols=80,
            desc="Rollout steps",
        )

        for _ in bar:
            try:
                images = multi_camera.get_frame()
                last_state = robot_interface._state_buffer[-1]
                last_gripper_state = robot_interface._gripper_state_buffer[-1]
                frame, _ = images[THIRD_CAMERA]
                ego_frame, _ = images[EGO_CAMERA]

                if not action_plan:
                    # Convert the BGR camera frames to RGB once and keep a copy
                    # of the external view for the replay video.
                    frame_rgb = cv2.cvtColor(copy.deepcopy(frame), cv2.COLOR_BGR2RGB)
                    ego_frame_rgb = cv2.cvtColor(copy.deepcopy(ego_frame), cv2.COLOR_BGR2RGB)
                    replay_images.append(frame_rgb.copy())

                    # O_T_EE is stored column-major, hence the transpose after reshape.
                    eef_pose = np.asarray(last_state.O_T_EE, dtype=np.float32).reshape(4, 4).T
                    eef_state = np.concatenate(
                        (eef_pose[:3, -1], dft.mat2euler(eef_pose[:3, :-1])),
                        axis=-1,
                    )
                    gripper_state = np.array([last_gripper_state.width])
                    state_data = np.concatenate([eef_state.flatten(), np.array([0]), gripper_state])

                    frame_resized = cv2.resize(frame_rgb, (args.resize_size, args.resize_size))
                    ego_frame_resized = cv2.resize(ego_frame_rgb, (args.resize_size, args.resize_size))

                    ego_image = Image.fromarray(ego_frame_resized)
                    image = Image.fromarray(frame_resized)

                    # NOTE: Change the keys to match your model
                    batch = {
                        "observation.images.image": [image],
                        "observation.images.wrist_image": [ego_image],
                        "observation.state": [state_data],
                        "task": [args.task],
                        "repo_id": [args.repo_id],
                    }
                    ov_out = processor.select_action(model, batch)
                    action_chunk = ov_out.action[0].numpy()
                    assert len(action_chunk) >= args.replan_steps, (
                        f"We want to replan every {args.replan_steps} steps, "
                        f"but the policy only predicts {len(action_chunk)} steps."
                    )
                    action_plan.extend(action_chunk[: args.replan_steps])

                action = action_plan.popleft()
                # Convert the predicted euler rotation to axis-angle for OSC_POSE.
                rotation_matrix = dft.euler2mat(action[3:6])
                quat = dft.mat2quat(rotation_matrix)
                axis_angle = dft.quat2axisangle(quat)
                action[3:6] = axis_angle
                action = convert_gripper_action(action)

                robot_interface.control(
                    controller_type=controller_type, action=action, controller_cfg=controller_cfg
                )

            except KeyboardInterrupt:
                break

        # Save the replay video
        video_save_path = args.video_out_path / args.task.replace(" ", "_")
        video_save_path.mkdir(parents=True, exist_ok=True)
        curr_time = datetime.now().strftime("%Y_%m_%d_%H_%M_%S")
        save_path = video_save_path / f"{curr_time}.mp4"
        video = np.stack(replay_images)
        imageio.mimsave(save_path, video, fps=20)

        if input("Do one more eval (default y)? [y/n] ").lower() == "n":
            break


if __name__ == "__main__":
    args = tyro.cli(Args)
    main(args)
```
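The per-step action conversion in the script goes euler angles → rotation matrix → quaternion → axis-angle via deoxys's `transform_utils`. The same round trip can be sketched with SciPy's `Rotation` (a stand-in used here for illustration; the `"xyz"` euler convention is an assumption, and the deoxys helpers may differ in convention):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

# Euler angles (radians), as predicted in action[3:6]; 'xyz' order is an assumption.
euler = np.array([0.1, -0.2, 0.3])

# euler -> rotation matrix -> quaternion -> axis-angle (rotation vector),
# mirroring dft.euler2mat / dft.mat2quat / dft.quat2axisangle
rot = R.from_euler("xyz", euler)
matrix = rot.as_matrix()
quat = R.from_matrix(matrix).as_quat()
axis_angle = R.from_quat(quat).as_rotvec()

# The round trip preserves the rotation itself:
assert np.allclose(R.from_rotvec(axis_angle).as_matrix(), matrix)
print(axis_angle.shape)  # -> (3,)
```

The OSC_POSE controller consumes the axis-angle form, which is why the euler prediction is converted before each `robot_interface.control` call.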
0 commit comments