Commit 3347c80

Merge branch 'main' into kevin
2 parents 31e8878 + 4b6f2ae commit 3347c80

43 files changed

Lines changed: 1473 additions & 489 deletions

Development.md

Lines changed: 23 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,10 @@
11
# Development Guide
22

33
This guide is for people working on OpenHands and editing the source code.
4-
If you wish to contribute your changes, check out the [CONTRIBUTING.md](https://github.com/All-Hands-AI/OpenHands/blob/main/CONTRIBUTING.md) on how to clone and set up the project
5-
initially before moving on. Otherwise, you can clone the OpenHands project directly.
4+
If you wish to contribute your changes, check out the
5+
[CONTRIBUTING.md](https://github.com/All-Hands-AI/OpenHands/blob/main/CONTRIBUTING.md)
6+
on how to clone and set up the project initially before moving on. Otherwise,
7+
you can clone the OpenHands project directly.
68

79
## Start the Server for Development
810

@@ -19,9 +21,20 @@ initially before moving on. Otherwise, you can clone the OpenHands project direc
1921

2022
Make sure you have all these dependencies installed before moving on to `make build`.
2123

24+
#### Dev container
25+
26+
There is a [dev container](https://containers.dev/) available which provides a
27+
pre-configured environment with all the necessary dependencies installed if you
28+
are using a [supported editor or tool](https://containers.dev/supporting). For
29+
example, if you are using Visual Studio Code (VS Code) with the
30+
[Dev Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
31+
extension installed, you can open the project in a dev container by using the
32+
_Dev Containers: Reopen in Container_ command from the Command Palette
33+
(Ctrl+Shift+P).
34+
2235
#### Develop without sudo access
2336

24-
If you want to develop without system admin/sudo access to upgrade/install `Python` and/or `NodeJs`, you can use
37+
If you want to develop without system admin/sudo access to upgrade/install `Python` and/or `NodeJs`, you can use
2538
`conda` or `mamba` to manage the packages for you:
2639

2740
```bash
@@ -37,7 +50,7 @@ mamba install conda-forge::poetry
3750

3851
### 2. Build and Setup The Environment
3952

40-
Begin by building the project which includes setting up the environment and installing dependencies. This step ensures
53+
Begin by building the project which includes setting up the environment and installing dependencies. This step ensures
4154
that OpenHands is ready to run on your system:
4255

4356
```bash
@@ -54,11 +67,11 @@ To configure the LM of your choice, run:
5467
make setup-config
5568
```
5669

57-
This command will prompt you to enter the LLM API key, model name, and other variables ensuring that OpenHands is
58-
tailored to your specific needs. Note that the model name will apply only when you run headless. If you use the UI,
70+
This command will prompt you to enter the LLM API key, model name, and other variables ensuring that OpenHands is
71+
tailored to your specific needs. Note that the model name will apply only when you run headless. If you use the UI,
5972
please set the model in the UI.
6073

61-
Note: If you have previously run OpenHands using the docker command, you may have already set some environment
74+
Note: If you have previously run OpenHands using the docker command, you may have already set some environment
6275
variables in your terminal. Configuration sources are applied in the following order, from highest to lowest priority:
6376
Environment variables > config.toml variables > default variables
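
The precedence rule above can be sketched in a few lines of Python. This is an illustration only, not the OpenHands loader: the flat `LLM_MODEL` key, the `DEFAULTS` dictionary, and the `resolve` helper are all hypothetical names chosen for the example.

```python
import os

# Hypothetical default values, for illustration only.
DEFAULTS = {"LLM_MODEL": "default-model"}


def resolve(name, file_values, env=None, defaults=DEFAULTS):
    """Return the first value found: environment, then config file, then default."""
    env = os.environ if env is None else env
    for source in (env, file_values, defaults):
        if name in source:
            return source[name]
    raise KeyError(name)


# Values as they might appear after reading config.toml (hypothetical).
file_values = {"LLM_MODEL": "model-from-config-toml"}

print(resolve("LLM_MODEL", file_values, env={"LLM_MODEL": "model-from-env"}))
print(resolve("LLM_MODEL", file_values, env={}))
print(resolve("LLM_MODEL", {}, env={}))
```

In practice this means a variable left over in your shell from an earlier docker run can silently override what you wrote in `config.toml`.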
6477

@@ -77,14 +90,14 @@ make run
7790

7891
#### Option B: Individual Server Startup
7992

80-
- **Start the Backend Server:** If you prefer, you can start the backend server independently to focus on
93+
- **Start the Backend Server:** If you prefer, you can start the backend server independently to focus on
8194
backend-related tasks or configurations.
8295

8396
```bash
8497
make start-backend
8598
```
8699

87-
- **Start the Frontend Server:** Similarly, you can start the frontend server on its own to work on frontend-related
100+
- **Start the Frontend Server:** Similarly, you can start the frontend server on its own to work on frontend-related
88101
components or interface enhancements.
89102
```bash
90103
make start-frontend
@@ -120,7 +133,7 @@ poetry run pytest ./tests/unit/test_*.py
120133

121134
### 9. Use existing Docker image
122135

123-
To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker
136+
To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker
124137
container image by setting the SANDBOX_RUNTIME_CONTAINER_IMAGE environment variable to the desired Docker image.
125138

126139
Example: `export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.39-nikolaik`

README_CN.md

Lines changed: 146 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,146 @@
1+
2+
<a name="readme-top"></a>
3+
4+
<div align="center">
5+
<img src="./docs/static/img/logo.png" alt="Logo" width="200">
6+
<h1 align="center">OpenHands: Code Less, Make More</h1>
7+
</div>
8+
9+
10+
<div align="center">
11+
<a href="https://github.com/All-Hands-AI/OpenHands/graphs/contributors"><img src="https://img.shields.io/github/contributors/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Contributors"></a>
12+
<a href="https://github.com/All-Hands-AI/OpenHands/stargazers"><img src="https://img.shields.io/github/stars/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Stargazers"></a>
13+
<a href="https://github.com/All-Hands-AI/OpenHands/blob/main/LICENSE"><img src="https://img.shields.io/github/license/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="MIT License"></a>
14+
<br/>
15+
<a href="https://join.slack.com/t/openhands-ai/shared_invite/zt-34zm4j0gj-Qz5kRHoca8DFCbqXPS~f_A"><img src="https://img.shields.io/badge/Slack-Join%20Us-red?logo=slack&logoColor=white&style=for-the-badge" alt="Join our Slack community"></a>
16+
<a href="https://discord.gg/ESHStjSjD4"><img src="https://img.shields.io/badge/Discord-Join%20Us-purple?logo=discord&logoColor=white&style=for-the-badge" alt="Join our Discord community"></a>
17+
<a href="https://github.com/All-Hands-AI/OpenHands/blob/main/CREDITS.md"><img src="https://img.shields.io/badge/Project-Credits-blue?style=for-the-badge&color=FFE165&logo=github&logoColor=white" alt="Credits"></a>
18+
<br/>
19+
<a href="https://docs.all-hands.dev/modules/usage/getting-started"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation"></a>
20+
<a href="https://arxiv.org/abs/2407.16741"><img src="https://img.shields.io/badge/Paper%20on%20Arxiv-000?logoColor=FFE165&logo=arxiv&style=for-the-badge" alt="Paper on Arxiv"></a>
21+
<a href="https://docs.google.com/spreadsheets/d/1wOUdFCMyY6Nt0AIqF705KN4JKOWgeI4wUGUP60krXXs/edit?gid=0#gid=0"><img src="https://img.shields.io/badge/Benchmark%20score-000?logoColor=FFE165&logo=huggingface&style=for-the-badge" alt="Evaluation benchmark score"></a>
22+
<hr>
23+
</div>
24+
25+
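Welcome to OpenHands (formerly OpenDevin), a platform for software development agents powered by AI.
26+
27+
OpenHands agents can do anything a human developer can: modify code, run commands, browse the web, call APIs, and even copy code snippets from StackOverflow.
28+
29+
Learn more at [docs.all-hands.dev](https://docs.all-hands.dev), or [sign up for OpenHands Cloud](https://app.all-hands.dev) to get started.
30+
31+
> [!IMPORTANT]
32+
> Using OpenHands for work? We'd love to chat! Fill out
33+
> [this short form](https://docs.google.com/forms/d/e/1FAIpQLSet3VbGaz8z32gW9Wm-Grl4jpt5WgMXPgJ4EDPVmCETCBpJtQ/viewform)
34+
> to join our Design Partner program, where you'll get early access to commercial features and the opportunity to give input on our product roadmap.
35+
36+
![App screenshot](./docs/static/img/screenshot.png)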
37+
38+
## ☁️ OpenHands Cloud
39+
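The easiest way to get started with OpenHands is on [OpenHands Cloud](https://app.all-hands.dev),
40+
which comes with $50 in free credits for new users.
41+
42+
## 💻 Running OpenHands Locally
43+
44+
OpenHands can also run on your local system using Docker.
45+
See the [Running OpenHands](https://docs.all-hands.dev/modules/usage/installation) guide for
46+
system requirements and more information.
47+
48+
> [!WARNING]
49+
> On a public network? See our [Hardened Docker Installation Guide](https://docs.all-hands.dev/modules/usage/runtimes/docker#hardened-docker-installation)
50+
> to secure your deployment by restricting network binding and implementing additional security measures.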
51+
52+
53+
```bash
54+
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik
55+
56+
docker run -it --rm --pull=always \
57+
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik \
58+
-e LOG_ALL_EVENTS=true \
59+
-v /var/run/docker.sock:/var/run/docker.sock \
60+
-v ~/.openhands-state:/.openhands-state \
61+
-p 3000:3000 \
62+
--add-host host.docker.internal:host-gateway \
63+
--name openhands-app \
64+
docker.all-hands.dev/all-hands-ai/openhands:0.39
65+
```
66+
67+
You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!
68+
69+
When you open the application, you'll be asked to choose an LLM provider and add an API key.
70+
[Anthropic's Claude Sonnet 4](https://www.anthropic.com/api) (`anthropic/claude-sonnet-4-20250514`)
71+
works best, but you have [many options](https://docs.all-hands.dev/modules/usage/llms).
72+
73+
## 💡 Other Ways to Run OpenHands
74+
75+
> [!CAUTION]
76+
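> OpenHands is meant to be run by a single user on their local workstation.
77+
> It is not appropriate for multi-tenant deployments where multiple users share the same instance. There is no built-in authentication, isolation, or scalability.
78+
>
79+
> If you're interested in running OpenHands in a multi-tenant environment, please
80+
> [get in touch with us](https://docs.google.com/forms/d/e/1FAIpQLSet3VbGaz8z32gW9Wm-Grl4jpt5WgMXPgJ4EDPVmCETCBpJtQ/viewform)
81+
> for advanced deployment options.
82+
83+
You can also [connect OpenHands to your local filesystem](https://docs.all-hands.dev/modules/usage/runtimes/docker#connecting-to-your-filesystem),
84+
run OpenHands in a scriptable [headless mode](https://docs.all-hands.dev/modules/usage/how-to/headless-mode),
85+
interact with it via a [friendly CLI](https://docs.all-hands.dev/modules/usage/how-to/cli-mode),
86+
or run it on tagged issues with a [GitHub Action](https://docs.all-hands.dev/modules/usage/how-to/github-action).
87+
88+
Visit [Running OpenHands](https://docs.all-hands.dev/modules/usage/installation) for more information and setup instructions.
89+
90+
If you want to modify the OpenHands source code, check out [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md).
91+
92+
Having issues? The [Troubleshooting Guide](https://docs.all-hands.dev/modules/usage/troubleshooting) can help.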
93+
94+
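## 📖 Documentation
95+
<a href="https://deepwiki.com/All-Hands-AI/OpenHands"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki" title="Autogenerated documentation by DeepWiki"></a>
96+
97+
To learn more about the project, and for tips on using OpenHands,
98+
check out our [documentation](https://docs.all-hands.dev/modules/usage/getting-started).
99+
100+
There you'll find resources on how to use different LLM providers,
101+
troubleshooting resources, and advanced configuration options.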
102+
103+
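## 🤝 How to Join the Community
104+
105+
OpenHands is a community-driven project, and we welcome contributions from everyone. We do most of our communication
106+
through Slack, so this is the best place to start, but we are also happy to have you contact us on Discord or GitHub:
107+
108+
- [Join our Slack workspace](https://join.slack.com/t/openhands-ai/shared_invite/zt-34zm4j0gj-Qz5kRHoca8DFCbqXPS~f_A) - Here we talk about research, architecture, and future development.
109+
- [Join our Discord server](https://discord.gg/ESHStjSjD4) - This is a community-run server for general discussion, questions, and feedback.
110+
- [Read or post GitHub Issues](https://github.com/All-Hands-AI/OpenHands/issues) - Check out the issues we're working on, or add your own ideas.
111+
112+
See more about the community in [COMMUNITY.md](./COMMUNITY.md) or find details on contributing in [CONTRIBUTING.md](./CONTRIBUTING.md).
113+
114+
## 📈 Progress
115+
116+
See the monthly OpenHands roadmap [here](https://github.com/orgs/All-Hands-AI/projects/1) (updated at the maintainers' meeting at the end of each month).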
117+
118+
<p align="center">
119+
<a href="https://star-history.com/#All-Hands-AI/OpenHands&Date">
120+
<img src="https://api.star-history.com/svg?repos=All-Hands-AI/OpenHands&type=Date" width="500" alt="Star History Chart">
121+
</a>
122+
</p>
123+
124+
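## 📜 License
125+
126+
Distributed under the MIT License. See [`LICENSE`](./LICENSE) for more information.
127+
128+
## 🙏 Acknowledgements
129+
130+
OpenHands is built by a large number of contributors, and every contribution is greatly appreciated! We also build upon other open source projects, and we are deeply thankful for their work.
131+
132+
For a list of open source projects and licenses used in OpenHands, please see our [CREDITS.md](./CREDITS.md) file.
133+
134+
## 📚 Cite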
135+
136+
```
137+
@misc{openhands,
138+
title={{OpenHands: An Open Platform for AI Software Developers as Generalist Agents}},
139+
author={Xingyao Wang and Boxuan Li and Yufan Song and Frank F. Xu and Xiangru Tang and Mingchen Zhuge and Jiayi Pan and Yueqi Song and Bowen Li and Jaskirat Singh and Hoang H. Tran and Fuqiang Li and Ren Ma and Mingzhang Zheng and Bill Qian and Yanjun Shao and Niklas Muennighoff and Yizhe Zhang and Binyuan Hui and Junyang Lin and Robert Brennan and Hao Peng and Heng Ji and Graham Neubig},
140+
year={2024},
141+
eprint={2407.16741},
142+
archivePrefix={arXiv},
143+
primaryClass={cs.SE},
144+
url={https://arxiv.org/abs/2407.16741},
145+
}
146+
```

config.template.toml

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -328,6 +328,15 @@ classpath = "my_package.my_module.MyCustomAgent"
328328
# Useful when deploying OpenHands in a remote machine where you need to expose a specific port.
329329
#vscode_port = 41234
330330

331+
# Volume mounts in the format 'host_path:container_path[:mode]'
332+
# e.g. '/my/host/dir:/workspace:rw'
333+
# Multiple mounts can be specified using commas
334+
# e.g. '/path1:/workspace/path1,/path2:/workspace/path2:ro'
335+
336+
# Configure volumes under the [sandbox] section:
337+
# [sandbox]
338+
# volumes = "/my/host/dir:/workspace:rw,/path2:/workspace/path2:ro"
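
To make the `host_path:container_path[:mode]` format concrete, here is a minimal Python sketch of how such a string could be split into individual mounts. It is not the OpenHands parser: the `Mount` type, the `parse_volumes` helper, and the `rw` default for an omitted mode are assumptions made for illustration, and the simple colon split does not handle Windows drive letters.

```python
from typing import NamedTuple


class Mount(NamedTuple):
    host_path: str
    container_path: str
    mode: str


def parse_volumes(spec: str, default_mode: str = "rw") -> list[Mount]:
    """Split a comma-separated list of 'host:container[:mode]' entries."""
    mounts = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        parts = entry.split(":")
        if len(parts) == 2:
            host, container = parts
            mode = default_mode  # assumed default when no mode is given
        elif len(parts) == 3:
            host, container, mode = parts
        else:
            raise ValueError(f"expected host:container[:mode], got {entry!r}")
        mounts.append(Mount(host, container, mode))
    return mounts


for mount in parse_volumes("/my/host/dir:/workspace:rw,/path2:/workspace/path2:ro"):
    print(mount)
```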
339+
331340
#################################### Security ###################################
332341
# Configuration for security features
333342
##############################################################################

docs/modules/usage/configuration-options.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -331,6 +331,8 @@ The agent configuration options are defined in the `[agent]` and `[agent.<agent_
331331

332332
The sandbox configuration options are defined in the `[sandbox]` section of the `config.toml` file.
333333

334+
335+
334336
To use these with the docker command, pass in `-e SANDBOX_<option>`. Example: `-e SANDBOX_TIMEOUT`.
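
As a rough illustration of the naming rule, the Python sketch below turns a `[sandbox]` option name into the corresponding `docker run` argument. The `sandbox_env_flag` helper is hypothetical, and upper-casing the option name is an assumption inferred from the `SANDBOX_TIMEOUT` example.

```python
def sandbox_env_flag(option, value=None):
    """Build the docker '-e' arguments for a [sandbox] option.

    With no value, Docker forwards the variable from the calling shell;
    with a value, it is set explicitly inside the container.
    """
    name = "SANDBOX_" + option.upper()
    if value is None:
        return ["-e", name]
    return ["-e", f"{name}={value}"]


print(sandbox_env_flag("timeout"))
print(sandbox_env_flag("timeout", 120))
```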
335337

336338
### Execution

evaluation/benchmarks/swe_bench/README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
This folder contains the evaluation harness that we built on top of the original [SWE-Bench benchmark](https://www.swebench.com/) ([paper](https://arxiv.org/abs/2310.06770)).
44

5+
**UPDATE (5/26/2025): We now support running interactive SWE-Bench evaluation (see the paper [here](https://arxiv.org/abs/2502.13069))! To run it, check out [this README](./SWE-Interact.md).**
6+
57
**UPDATE (4/8/2025): We now support running SWT-Bench evaluation! For more details, check out [the corresponding section](#SWT-Bench-Evaluation).**
68

79
**UPDATE (03/27/2025): We now support SWE-Bench multimodal evaluation! Simply use "princeton-nlp/SWE-bench_Multimodal" as the dataset name in the `run_infer.sh` script to evaluate on multimodal instances.**
Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
# SWE-Interact Benchmark
2+
3+
This document explains how to use the [Interactive SWE-Bench](https://arxiv.org/abs/2502.13069) benchmark scripts for running and evaluating interactive software engineering tasks.
4+
5+
## Setting things up
6+
After following the [README](./README.md) to set up the environment, you also need to add an LLM configuration for the simulated human user. In the original [paper](https://arxiv.org/abs/2502.13069), we use gpt-4o as the simulated human user. You can add the following to your `config.toml` file:
7+
8+
```toml
9+
[llm.fake_user]
10+
model="litellm_proxy/gpt-4o-2024-08-06"
11+
api_key="<your-api-key>"
12+
temperature = 0.0
13+
base_url = "https://llm-proxy.eval.all-hands.dev"
14+
```
15+
16+
## Running the Benchmark
17+
18+
The main script for running the benchmark is `run_infer_interact.sh`. Here's how to use it:
19+
20+
```bash
21+
bash ./evaluation/benchmarks/swe_bench/scripts/run_infer_interact.sh <model_config> <commit_hash> <agent> <eval_limit> <max_iter> <num_workers> <split>
22+
```
23+
24+
### Parameters:
25+
26+
- `model_config`: Name of the LLM configuration group defined in your `config.toml` (e.g., `llm.claude-3-7-sonnet`)
27+
- `commit_hash`: Git commit hash to use (e.g., `HEAD`)
28+
- `agent`: The agent class to use (e.g., `CodeActAgent`)
29+
- `eval_limit`: Number of examples to evaluate (e.g., `500`)
30+
- `max_iter`: Maximum number of iterations per task (e.g., `100`)
31+
- `num_workers`: Number of parallel workers (e.g., `1`)
32+
- `split`: Dataset split to use (e.g., `test`)
33+
34+
### Example:
35+
36+
```bash
37+
bash ./evaluation/benchmarks/swe_bench/scripts/run_infer_interact.sh llm.claude-3-7-sonnet HEAD CodeActAgent 500 100 1 test
38+
```
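
If you launch runs from a script, the positional arguments can be assembled programmatically. The following Python sketch only builds and prints the command line; the `build_command` helper is hypothetical, and its default values are copied from the example above rather than from the script itself.

```python
import shlex

SCRIPT = "./evaluation/benchmarks/swe_bench/scripts/run_infer_interact.sh"


def build_command(model_config, commit_hash="HEAD", agent="CodeActAgent",
                  eval_limit=500, max_iter=100, num_workers=1, split="test"):
    """Return the argv list in the positional order the script expects."""
    return [
        "bash", SCRIPT,
        model_config, commit_hash, agent,
        str(eval_limit), str(max_iter), str(num_workers), split,
    ]


cmd = build_command("llm.claude-3-7-sonnet")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.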
39+
40+
### Additional Environment Variables:
41+
42+
You can customize the behavior using these environment variables:
43+
44+
- `RUN_WITH_BROWSING`: Enable/disable web browsing (default: false)
45+
- `USE_HINT_TEXT`: Enable/disable hint text (default: false)
46+
- `EVAL_CONDENSER`: Specify a condenser configuration
47+
- `EXP_NAME`: Add a custom experiment name to the output
48+
- `N_RUNS`: Number of runs to perform (default: 1)
49+
- `SKIP_RUNS`: Comma-separated list of run numbers to skip
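
The variables listed above can be collected into an environment mapping before the script is launched. This Python sketch is illustrative only: the `benchmark_env` helper is hypothetical, and encoding booleans as `true`/`false` is an assumption based on the documented defaults.

```python
import os

KNOWN_VARS = {
    "RUN_WITH_BROWSING", "USE_HINT_TEXT", "EVAL_CONDENSER",
    "EXP_NAME", "N_RUNS", "SKIP_RUNS",
}


def benchmark_env(base=None, **overrides):
    """Return a copy of the environment with benchmark variables applied."""
    env = dict(os.environ if base is None else base)
    for name, value in overrides.items():
        if name not in KNOWN_VARS:
            raise ValueError(f"unknown benchmark variable: {name}")
        if isinstance(value, bool):
            value = "true" if value else "false"  # assumed boolean encoding
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # e.g. SKIP_RUNS=1,2
        env[name] = str(value)
    return env


print(benchmark_env(base={}, RUN_WITH_BROWSING=True, N_RUNS=3, SKIP_RUNS=[1, 2]))
```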
50+
51+
## Evaluating Results
52+
53+
After running the benchmark, you can evaluate the results using `eval_infer.sh`:
54+
55+
```bash
56+
./evaluation/benchmarks/swe_bench/scripts/eval_infer.sh <output_file> <instance_id> <dataset> <split>
57+
```
58+
59+
### Parameters:
60+
61+
- `output_file`: Path to the output JSONL file
62+
- `instance_id`: The specific instance ID to evaluate
63+
- `dataset`: Dataset name (e.g., `cmu-lti/interactive-swe`)
64+
- `split`: Dataset split (e.g., `test`)
65+
66+
### Example:
67+
68+
```bash
69+
./evaluation/benchmarks/swe_bench/scripts/eval_infer.sh evaluation/evaluation_outputs/outputs/cmu-lti__interactive-swe-test/CodeActAgent/claude-3-7-sonnet-20250219_maxiter_100_N_v0.39.0-no-hint-run_1/output.jsonl sphinx-doc__sphinx-8721 cmu-lti/interactive-swe test
70+
```
71+
72+
## Output Structure
73+
74+
The benchmark outputs are stored in the `evaluation/evaluation_outputs/outputs/` directory with the following structure:
75+
76+
```
77+
evaluation/evaluation_outputs/outputs/
78+
└── cmu-lti__interactive-swe-{split}/
79+
└── {agent}/
80+
└── {model}-{date}_maxiter_{max_iter}_N_{version}-{options}-run_{run_number}/
81+
└── output.jsonl
82+
```
83+
84+
Where:
85+
- `{split}` is the dataset split (e.g., test)
86+
- `{agent}` is the agent class name
87+
- `{model}` is the model name
88+
- `{date}` is the run date
89+
- `{max_iter}` is the maximum iterations
90+
- `{version}` is the OpenHands version
91+
- `{options}` includes any additional options (e.g., no-hint, with-browsing)
92+
- `{run_number}` is the run number
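
When collecting results from many runs, it can help to recover these fields from an output path. The Python sketch below does so with a regular expression. It is an assumption-laden illustration: `parse_output_path` is a hypothetical helper, the model name and date are kept together because they cannot be separated reliably, and the version is assumed to start with `v` as in the example path above.

```python
import re
from pathlib import PurePosixPath

RUN_DIR = re.compile(
    r"^(?P<model_and_date>.+)_maxiter_(?P<max_iter>\d+)"
    r"_N_(?P<version>v[0-9.]+)-(?P<options>.+)-run_(?P<run_number>\d+)$"
)


def parse_output_path(path):
    """Extract run metadata from a .../{dataset}/{agent}/{run}/output.jsonl path."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4 or parts[-1] != "output.jsonl":
        raise ValueError(f"not an output.jsonl path: {path!r}")
    dataset_dir, agent, run_dir = parts[-4], parts[-3], parts[-2]
    match = RUN_DIR.match(run_dir)
    if match is None:
        raise ValueError(f"unrecognized run directory: {run_dir!r}")
    info = match.groupdict()
    info["max_iter"] = int(info["max_iter"])
    info["run_number"] = int(info["run_number"])
    info["dataset_dir"] = dataset_dir
    info["agent"] = agent
    return info


example = (
    "evaluation/evaluation_outputs/outputs/cmu-lti__interactive-swe-test/"
    "CodeActAgent/claude-3-7-sonnet-20250219_maxiter_100_N_v0.39.0-no-hint-run_1/"
    "output.jsonl"
)
print(parse_output_path(example))
```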
