Rename the package before its initial PyPI release for a catchier,
more memorable name aligned with projects like roboflow and robocorp.
Changes span:
- pyproject.toml: package name, optional deps, wheel config
- src/robot_harness/ → src/roboharness/: directory and all imports
- All test, example, and documentation files updated
- CI workflows, issue templates, and SVG assets updated
Closes #21
https://claude.ai/code/session_01LA2Uo7hCVdpo5gjGXief4H
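The import and package-name rewrite listed above could be scripted roughly as follows. This is a hypothetical sketch, not the procedure used in this commit: the helper names `rename_text` and `rename_tree` and the list of file suffixes are assumptions, and only the old and new names come from the commit message and diff. Moving `src/robot_harness/` itself would still be a separate step (for example with `git mv`).

```python
import re
from pathlib import Path

OLD_MODULE = "robot_harness"  # import name before the rename
NEW_MODULE = "roboharness"    # import name after the rename
OLD_DIST = "robot-harness"    # pip distribution name before the rename
NEW_DIST = "roboharness"      # pip distribution name after the rename

# Match the old names only as whole tokens so longer identifiers are left alone.
_MODULE_RE = re.compile(rf"\b{re.escape(OLD_MODULE)}\b")
_DIST_RE = re.compile(rf"(?<![\w-]){re.escape(OLD_DIST)}(?![\w-])")


def rename_text(text: str) -> str:
    """Return text with the old module and distribution names replaced."""
    text = _MODULE_RE.sub(NEW_MODULE, text)
    return _DIST_RE.sub(NEW_DIST, text)


def rename_tree(root: Path, suffixes=(".py", ".toml", ".md", ".yml", ".yaml")):
    """Rewrite matching files under root in place and return the changed paths."""
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        original = path.read_text(encoding="utf-8")
        updated = rename_text(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

The match is case-sensitive, so display names such as `Robot-Harness` in headings are not touched by this sketch and would need their own pass.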
README.md
+15 -15 (15 additions & 15 deletions)
@@ -1,6 +1,6 @@
 <div align="center">

-# Robot-Harness
+# Roboharness

 **A Visual Testing Harness for AI Coding Agents in Robot Simulation**

@@ -20,26 +20,26 @@

 </div>

-## What is Robot-Harness?
+## What is Roboharness?

-Robot-Harness is a framework that lets AI Coding Agents (Claude Code, OpenAI Codex, [OpenClaw](https://github.com/openclaw/openclaw), etc.) control robot simulations through a **visual feedback loop**:
+Roboharness is a framework that lets AI Coding Agents (Claude Code, OpenAI Codex, [OpenClaw](https://github.com/openclaw/openclaw), etc.) control robot simulations through a **visual feedback loop**:
-**Key insight**: Modern coding agents are already multimodal — they can write code AND see images AND make decisions. We don't need a separate VLM. Robot-Harness just needs to present simulation visuals in a format agents can directly consume.
+**Key insight**: Modern coding agents are already multimodal — they can write code AND see images AND make decisions. We don't need a separate VLM. Roboharness just needs to present simulation visuals in a format agents can directly consume.

 ## Installation

 ```bash
-pip install robot-harness
+pip install roboharness

 # With MuJoCo + Meshcat backend
-pip install robot-harness[mujoco]
+pip install roboharness[mujoco]

 # Development
-pip install robot-harness[dev]
+pip install roboharness[dev]
 ```

 ## Quick Start
@@ -49,7 +49,7 @@ pip install robot-harness[dev]
 Run a complete grasp simulation with zero external dependencies:

 ```bash
-pip install robot-harness[mujoco] Pillow
+pip install roboharness[mujoco] Pillow
 python examples/mujoco_grasp.py --report
 ```

@@ -73,7 +73,7 @@ Wrap any Gymnasium-compatible environment with one line:
 │ ├── harness.py # Main Harness class + SimulatorBackend protocol
 │ ├── checkpoint.py # Checkpoint management & state snapshots
@@ -218,7 +218,7 @@ We especially welcome:
 - Real-world usage examples
 - Integration with popular RL libraries (SB3, CleanRL, etc.)

-**AI agents are welcome contributors!** We actively encourage contributions from AI coding agents such as [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [OpenAI Codex](https://github.com/openai/codex), [OpenClaw](https://github.com/openclaw/openclaw), and other autonomous coding tools. If your agent can improve Robot-Harness, send a PR!
+**AI agents are welcome contributors!** We actively encourage contributions from AI coding agents such as [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [OpenAI Codex](https://github.com/openai/codex), [OpenClaw](https://github.com/openclaw/openclaw), and other autonomous coding tools. If your agent can improve Roboharness, send a PR!
-> **Purpose**: This is the complete context document for the robot-harness project, intended as a reference for Claude Code, Codex, and other AI Agents during code review, architecture design, and feature development.
+> **Purpose**: This is the complete context document for the roboharness project, intended as a reference for Claude Code, Codex, and other AI Agents during code review, architecture design, and feature development.

 ## Part 1: Project Overview and Motivation

-### 1.1 What is Robot-Harness
+### 1.1 What is Roboharness

-Robot-Harness is a **visual testing framework for AI Coding Agents in robot simulation**. Its core goal is to enable Claude Code, OpenAI Codex, and similar coding agents to:
+Roboharness is a **visual testing framework for AI Coding Agents in robot simulation**. Its core goal is to enable Claude Code, OpenAI Codex, and similar coding agents to:
 2. **Capture multi-view screenshots** — acquire RGB/depth images from different camera positions at the same simulation moment
 3. **Autonomously judge task results** — agent directly observes screenshots to determine whether motion is reasonable, grasps are successful, etc.
 4. **Iteratively optimize algorithms** — based on visual judgment results, the agent autonomously modifies control code and reruns

-**Fundamental difference from traditional approaches**: We don't need a separate VLM model for visual evaluation. Claude Code and Codex themselves are multimodal agents — they can write code, see images, and make decisions. Robot-Harness's responsibility is to **efficiently present simulation visual information in a format that agents can directly consume**.
+**Fundamental difference from traditional approaches**: We don't need a separate VLM model for visual evaluation. Claude Code and Codex themselves are multimodal agents — they can write code, see images, and make decisions. Roboharness's responsibility is to **efficiently present simulation visual information in a format that agents can directly consume**.

 ### 1.2 Core Use Case

 Taking a grasping task as an example, the complete Agent-in-the-loop workflow is:

 1. Agent writes/modifies grasp control code
-2. Robot-Harness runs the simulation, automatically pausing at predefined checkpoints (plan start, plan end, contact point, lift complete)
+2. Roboharness runs the simulation, automatically pausing at predefined checkpoints (plan start, plan end, contact point, lift complete)
 3. At each checkpoint, the Harness captures screenshots from multiple viewpoints and saves them as files
 4. Agent examines screenshots + structured state data, judging whether the current phase is normal
 5. If problems are found, the agent modifies code and reruns from the appropriate checkpoint
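The checkpoint workflow in steps 1 to 5 can be pictured as a small control loop. The sketch below is only an illustration under stated assumptions: `CheckpointRecord`, `run_episode`, `capture`, and `judge` are hypothetical names, not the Roboharness API, and only the four checkpoint names come from the text above.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Checkpoint names follow the workflow description; everything else is hypothetical.
CHECKPOINTS = ("plan_start", "plan_end", "contact_point", "lift_complete")


@dataclass
class CheckpointRecord:
    name: str
    screenshots: list                      # file paths of the saved views
    state: dict = field(default_factory=dict)  # structured state data


def run_episode(
    capture: Callable[[str], CheckpointRecord],
    judge: Callable[[CheckpointRecord], bool],
    start_from: str = CHECKPOINTS[0],
) -> Optional[str]:
    """Visit checkpoints in order; return the first one the judge rejects, else None."""
    start = CHECKPOINTS.index(start_from)
    for name in CHECKPOINTS[start:]:
        record = capture(name)   # step 3: capture views and save them as files
        if not judge(record):    # step 4: agent inspects screenshots and state
            return name          # step 5: caller edits code and reruns from here
    return None


def demo_capture(name: str) -> CheckpointRecord:
    """Stand-in for a simulator backend: returns fake screenshot paths."""
    return CheckpointRecord(name, [f"{name}_front.png", f"{name}_side.png"])


failed_at = run_episode(demo_capture, judge=lambda rec: rec.name != "contact_point")
print(failed_at)
```

Returning the failing checkpoint name lets the caller pass it back as `start_from` on the next run, which mirrors "reruns from the appropriate checkpoint" in step 5.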
@@ -165,7 +165,7 @@ This workflow is running and producing real results, but is currently a custom i
 - **VLM-RMs** (ICLR 2024): CLIP cosine similarity as zero-shot reward signal
docs/simulator-survey.en.md
+6 -6 (6 additions & 6 deletions)
@@ -1,4 +1,4 @@
-# Technical Survey of Open-Source Robot Control Projects and Robot-Harness Integration Analysis
+# Technical Survey of Open-Source Robot Control Projects and Roboharness Integration Analysis

 > The Gymnasium Wrapper approach can cover 60% of mainstream projects, but two major ecosystems (legged_gym family and JAX/Brax family) require dedicated adapters.

@@ -45,7 +45,7 @@ Video recording is supported via `gymnasium.wrappers.RecordVideo` (requires `--e

 Known issues: headless mode rendering hangs (#324), camera-enabled environments not rendering (#3250), recordings missing debug markers (#2233), WebRTC errors in Docker (#3192).

-### Robot-Harness Integration Feasibility: Very High
+### Roboharness Integration Feasibility: Very High

 The standard Gymnasium Wrapper approach is directly applicable:

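The survey's own snippet is not part of this hunk. As an independent illustration of the wrapper idea, the sketch below captures rendered frames at chosen step indices. It assumes only the Gymnasium conventions that `step()` returns `(obs, reward, terminated, truncated, info)` and that `render()` returns a frame; the class name `CheckpointCaptureWrapper` and the stub environment are hypothetical. The wrapper is duck-typed so the example runs without Gymnasium installed; real code would more likely subclass `gymnasium.Wrapper`.

```python
class CheckpointCaptureWrapper:
    """Duck-typed stand-in for a gymnasium.Wrapper subclass (hypothetical name)."""

    def __init__(self, env, capture_steps):
        self.env = env
        self.capture_steps = set(capture_steps)
        self.frames = {}       # step index -> frame returned by env.render()
        self._step_count = 0

    def reset(self, **kwargs):
        # Start a fresh episode and discard frames from the previous one.
        self._step_count = 0
        self.frames.clear()
        return self.env.reset(**kwargs)

    def step(self, action):
        result = self.env.step(action)  # (obs, reward, terminated, truncated, info)
        self._step_count += 1
        if self._step_count in self.capture_steps:
            self.frames[self._step_count] = self.env.render()
        return result


class StubEnv:
    """Tiny fake environment following the reset/step/render conventions."""

    def __init__(self):
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        return self.t, 0.0, self.t >= 5, False, {}

    def render(self):
        return f"frame-{self.t}"


env = CheckpointCaptureWrapper(StubEnv(), capture_steps=[2, 4])
env.reset()
for _ in range(5):
    env.step(None)
print(sorted(env.frames))
```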
@@ -77,7 +77,7 @@ The configuration system uses nested Python classes (neither dataclasses nor YAM

 **No automated tests, no CI/CD** — this is a notable weakness of the project.
 1. **Vectorized vs. single-instance**: Isaac Gym runs 4096+ parallel environments simultaneously; Gymnasium expects a single instance
@@ -102,7 +102,7 @@ Entirely based on JAX functional programming. Environments are pure functions: `

 Batch parallelism is achieved via `jax.vmap()`, and training loops complete full rollouts on-device using `jax.lax.scan`. Single-threaded MJX on GPU is 10x slower than CPU MuJoCo — its advantage comes entirely from massive parallelism (batch sizes of 1024–8192+).
-### Robot-Harness Integration Feasibility: Very High
+### Roboharness Integration Feasibility: Very High

 The CPUGymWrapper mode is directly compatible with the standard Gymnasium Wrapper. The ManiSkillVectorEnv mode requires handling batched GPU data, similar to Isaac Lab. The RecordEpisode wrapper provides built-in recording functionality.