
Commit 97bbe97

merge in new prompt with reasoning example
2 parents 725b734 + 76abb9c commit 97bbe97

25 files changed: +3483 −164 lines

AGENTS.md

Lines changed: 12 additions & 8 deletions
````diff
@@ -156,22 +156,25 @@ Environment implementations live in `rlm/environments/`. Choose the appropriate
 - Inherit from `NonIsolatedEnv` or `IsolatedEnv` in `rlm/environments/base_env.py`
 - Implement all abstract methods: `setup`, `load_context`, `execute_code`
 - Return `REPLResult` from `execute_code`
-- Handle `lm_handler_address` for sub-LM calls via `llm_query()`
+- Handle `lm_handler_address` for LM calls via `llm_query()` and `rlm_query()`
 - Implement `cleanup()` for resource management
 - Register environment in `rlm/environments/__init__.py`

 ### Key Implementation Details
 - `setup()`: Initialize globals, locals, and helper functions
 - `load_context()`: Make context available as `context` variable
 - `execute_code()`: Execute code, capture stdout/stderr, return `REPLResult`
-- Always provide `llm_query` and `llm_query_batched` functions in environment globals
+- Always provide `llm_query`, `llm_query_batched`, `rlm_query`, and `rlm_query_batched` functions in environment globals

 ### State Management
 Environments must provide these globals to executed code:
 - `context`: The loaded context payload
-- `llm_query(prompt, model=None)`: For sub-LM calls
-- `llm_query_batched(prompts, model=None)`: For batched sub-LM calls
+- `llm_query(prompt, model=None)`: Plain single LM completion (no REPL, no iteration)
+- `llm_query_batched(prompts, model=None)`: Batched plain LM completions
+- `rlm_query(prompt, model=None)`: Recursive child RLM call (own REPL + iteration). Falls back to `llm_query` at max depth.
+- `rlm_query_batched(prompts, model=None)`: Batched recursive child RLM calls
 - `FINAL_VAR(variable_name)`: For returning final answers
+- `SHOW_VARS()`: For listing available variables

 ### Example Structure
 ```python
````
````diff
@@ -204,7 +207,8 @@ class MyEnvironment(NonIsolatedEnv):
 - Guidelines here are followed
 - Environment works with basic RLM completion calls
 - `cleanup()` properly releases all resources
-- Sub-LM calls work via `llm_query()`
+- Sub-LM calls work via `llm_query()` and `rlm_query()`
+- Reserved names (`llm_query`, `rlm_query`, `context`, `history`, `FINAL_VAR`, `SHOW_VARS`) are restored after each execution

 ## Architecture: Environment ↔ LM Handler Communication

````
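The "reserved names are restored after each execution" requirement can be sketched as a snapshot-and-restore around `exec`. This is an assumed implementation for a simple `exec`-based non-isolated environment, not the repo's actual code; the names and helper below are illustrative.

```python
# Reserved environment globals that user code must not permanently clobber.
RESERVED = {"llm_query", "rlm_query", "context", "FINAL_VAR", "SHOW_VARS"}

def execute_code(code: str, env_globals: dict) -> None:
    # Snapshot the reserved bindings before running user code.
    saved = {name: env_globals[name] for name in RESERVED if name in env_globals}
    try:
        exec(code, env_globals)
    finally:
        # Restore reserved names even if the executed code overwrote them.
        env_globals.update(saved)

# Demo: user code clobbers llm_query, but it is restored afterwards.
g = {"llm_query": lambda p, model=None: "ok", "context": "doc"}
execute_code("llm_query = 'clobbered'\nresult = context.upper()", g)
```

After the call, `g["result"]` holds the user code's output while `g["llm_query"]` is the original callable again, so the next iteration's code still has a working environment.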
````diff
@@ -223,7 +227,7 @@ Understanding how environments communicate with the LM Handler is essential for
 │ ▼ │ │
 │ ┌─────────────┐ Socket (TCP) │ │
 │ │ LocalREPL │────────────────────────────────────┘ │
-│ │ (exec code) │ llm_query() → send_lm_request()
+│ │ (exec code) │ llm_query() / rlm_query() → LM calls
 │ └─────────────┘ │
 └─────────────────────────────────────────────────────────────────────┘
 ```
````
````diff
@@ -242,8 +246,8 @@ def socket_send(sock: socket.socket, data: dict) -> None:
 ```

 **Request Flow**:
-1. Environment's `llm_query(prompt)` is called during code execution
-2. Creates `LMRequest` dataclass and calls `send_lm_request(address, request)`
+1. Environment's `llm_query(prompt)` or `rlm_query(prompt)` is called during code execution
+2. For `llm_query`: creates `LMRequest` and calls `send_lm_request(address, request)`. For `rlm_query`: invokes `subcall_fn` to spawn a child RLM (or falls back to `llm_query` at max depth).
 3. Opens TCP connection to `LMHandler` at `(host, port)`
 4. Sends length-prefixed JSON request
 5. `LMHandler` processes via `LMRequestHandler.handle()`
````
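The length-prefixed JSON framing in steps 3–4 might look like the following minimal sketch. The 4-byte big-endian length header is an assumption for illustration; the repo's actual wire format may differ.

```python
import json
import socket
import struct

def socket_send(sock: socket.socket, data: dict) -> None:
    # Frame: 4-byte big-endian length header, then UTF-8 JSON payload.
    payload = json.dumps(data).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until done.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf

def socket_recv(sock: socket.socket) -> dict:
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

# Demo over an in-process socket pair (stands in for the TCP connection).
a, b = socket.socketpair()
socket_send(a, {"prompt": "hi", "model": None})
received = socket_recv(b)
a.close()
b.close()
```

The explicit length prefix lets the handler read exactly one request per message, which is why the receiver loops with `_recv_exact` instead of trusting a single `recv()` call.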

README.md

Lines changed: 7 additions & 7 deletions
````diff
@@ -77,11 +77,11 @@ make quickstart
 </details>

 ## REPL Environments
-We support two types of REPL environments -- isolated, and non-isolated. Non-isolated environments (default) run code execution on the same machine as the RLM (e.g. through `exec`), which is pretty reasonable for some local low-risk tasks, like simple benchmarking, but can be problematic if the prompts or tool calls can interact with malicious users. Fully isolated environments used Cloud-based sandboxes (e.g. Prime Sandboxes, [Modal Sandboxes](https://modal.com/docs/guide/sandboxes)) to run code generated by the RLM, ensuring completely isolation from the host process. Environments can be added, but we natively support the following: `local` (default), `modal`, `prime`.
+We support two types of REPL environments -- isolated, and non-isolated. Non-isolated environments (default) run code execution on the same machine as the RLM (e.g. through `exec`), which is pretty reasonable for some local low-risk tasks, like simple benchmarking, but can be problematic if the prompts or tool calls can interact with malicious users. Fully isolated environments use cloud-based sandboxes (e.g. Prime Sandboxes, [Modal Sandboxes](https://modal.com/docs/guide/sandboxes)) to run code generated by the RLM, ensuring complete isolation from the host process. Environments can be added, but we natively support the following: `local` (default), `docker`, `modal`, `prime`, `daytona`, `e2b`.

 ```python
 rlm = RLM(
-    environment="...",  # "local", "docker", "modal", "prime"
+    environment="...",  # "local", "docker", "modal", "prime", "daytona", "e2b"
     environment_kwargs={...},
 )
 ```
````
````diff
@@ -124,19 +124,19 @@ We currently support most major clients (OpenAI, Anthropic), as well as the rout
 If you use this code or repository in your research, please cite:

 ```bibtex
-@misc{zhang2025recursivelanguagemodels,
-      title={Recursive Language Models},
+@misc{zhang2026recursivelanguagemodels,
+      title={Recursive Language Models},
       author={Alex L. Zhang and Tim Kraska and Omar Khattab},
-      year={2025},
+      year={2026},
       eprint={2512.24601},
       archivePrefix={arXiv},
       primaryClass={cs.AI},
-      url={https://arxiv.org/abs/2512.24601},
+      url={https://arxiv.org/abs/2512.24601},
 }
 ```

 ## Optional: Trajectory metadata and logging
-`RLMChatCompletion` has an optional `metadata` field (default empty) that can hold the full trajectory (run config + all iterations and sub-calls) so you can reconstruct the run. Pass an `RLMLogger` to capture it:
+`RLMChatCompletion` has an optional `metadata` field (default `None`) that holds the full trajectory (run config + all iterations and sub-calls) so you can reconstruct the run. Pass an `RLMLogger` to capture it:

 - **In-memory only** (trajectory on `completion.metadata`): `logger=RLMLogger()` (no `log_dir`).
 - **Also save to disk** (JSONL for the visualizer): `logger=RLMLogger(log_dir="./logs")`.
````
