Commit 777a985

Lawhy and claude committed
chore: add CLAUDE.md, auto-detect model ID and health check for integration tests
Remove --sglang-model-id CLI option; model ID is now auto-detected from the server via /get_model_info. Integration tests skip automatically if the SGLang server is unreachable (/health check).

Co-Authored-By: Claude Opus 4.5 <[email protected]>
1 parent 9eeda06 commit 777a985

3 files changed: +96 -18 lines changed

CLAUDE.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Strands-env is an RL environment abstraction for Strands agents — step, observe, reward. It provides a base `Environment` class that wraps a Strands `Agent` with token-level observation tracking (TITO), reward computation, and termination handling. Supports SGLang, Bedrock, and OpenAI model backends.
+
+## Commands
+
+### Setup
+```bash
+pip install -e ".[dev]"
+```
+
+### Linting
+```bash
+ruff check src/
+ruff format --check src/
+```
+
+### Testing
+```bash
+# Unit tests (no server needed)
+pytest tests/unit/ -v
+
+# Single test
+pytest tests/unit/test_environment.py::TestStep::test_successful_step -v
+
+# Unit tests with coverage
+pytest tests/unit/ -v --cov=src/strands_env --cov-report=html
+
+# Integration tests (requires running SGLang server; model ID auto-detected via /get_model_info)
+# Tests skip automatically if server is unreachable (/health check)
+pytest tests/integration/ -v --sglang-base-url=http://localhost:30000
+# Or via env var: SGLANG_BASE_URL=http://localhost:30000 pytest tests/integration/
+```
+
+### Integration Tests with Remote GPU Server
+
+```bash
+# 1. Launch SGLang on the remote server in docker
+ssh <remote-host> "sudo docker run -d --gpus '\"device=0\"' --name sglang-test -p 30000:30000 --ipc=host lmsysorg/sglang:<tag> python3 -m sglang.launch_server --model-path <model-id> --host 0.0.0.0 --port 30000 --tp <num_gpus> --mem-fraction-static 0.7"
+# 2. Tunnel the port locally
+ssh -L 30000:localhost:30000 -N -f <remote-host>
+# 3. Run tests locally
+pytest tests/integration/ -v
+```
+
+## Architecture
+
+The package lives in `src/strands_env/core/` with three modules:
+
+**types.py** — All data types. `Action` carries a user message + `TaskContext` (ground truth, conversation history, arbitrary metadata via `extra="allow"`). `Observation` holds messages, metrics, and optional `TokenObservation` for TITO training. `TerminationReason` maps agent exceptions to enum values via `from_error()` which walks exception cause chains. `StepResult` bundles observation + reward + termination reason.
+
+**models.py** — `ModelFactory = Callable[[], Model]` type and three factory functions (`sglang_model_factory`, `bedrock_model_factory`, `openai_model_factory`). Each returns a zero-arg lambda that creates a fresh Model instance per `step()` call for concurrent isolation. Bedrock and OpenAI remap `max_new_tokens` → `max_tokens` with a shallow dict copy to avoid mutating defaults.
+
+**environment.py** — Base `Environment` class. `step(action)` creates a fresh model via factory, attaches a `TokenManager`, builds an `Agent` with tools/hooks (always includes `ToolIterationLimiter`), runs `invoke_async`, then collects metrics and optional reward. Subclasses override `get_tools()` and `get_hooks()` to customize. Messages are sliced so only new messages from the current step appear in the observation.
+
+### Key Design Decisions
+
+- **Factory pattern**: `ModelFactory` returns lambdas (not Model instances) so each `step()` gets a fresh model with clean token tracking state.
+- **TITO token tracking**: `TokenManager` on SGLang models captures exact token IDs and logprobs during generation. `TokenObservation.from_token_manager()` extracts prompt/rollout split. Non-SGLang models get an empty `TokenManager` (returns `None` from `from_token_manager`).
+- **`list()` copies**: Tools, hooks, and messages are copied via `list()` before passing to Agent to prevent cross-step mutation.
+- **ToolIterationLimiter**: Always prepended to hooks list. Raises `MaxToolIterationsReachedError` which `TerminationReason.from_error()` maps to `MAX_TOOL_ITERATIONS_REACHED`.
+
+## Code Style
+
+- Ruff for linting and formatting (line-length 120, rules: E, F, I, N, W)
+- Conventional commits (feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert)
+- Python 3.10+ required
+- asyncio_mode = "auto" for pytest-asyncio
+- Async-first: all Environment methods that interact with Agent are async
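
The factory pattern that CLAUDE.md describes can be illustrated with a minimal, self-contained sketch. `FakeModel` and `fake_model_factory` are hypothetical stand-ins, not the repository's classes. Only the ideas are taken from the text: a zero-argument factory, a fresh instance per call, and a shallow dict copy before remapping `max_new_tokens` to `max_tokens`.

```python
from typing import Any, Callable


class FakeModel:
    """Hypothetical stand-in for a Strands Model; it only stores its params."""

    def __init__(self, params: dict[str, Any]) -> None:
        self.params = params


# Mirrors the `ModelFactory = Callable[[], Model]` alias named in CLAUDE.md.
ModelFactory = Callable[[], FakeModel]


def fake_model_factory(sampling_params: dict[str, Any]) -> ModelFactory:
    """Return a zero-arg factory that builds a fresh model on every call."""

    def build() -> FakeModel:
        # Shallow copy so the caller's defaults are never mutated.
        params = dict(sampling_params)
        if "max_new_tokens" in params:
            # Assumed remapping, as described for the Bedrock/OpenAI factories.
            params["max_tokens"] = params.pop("max_new_tokens")
        return FakeModel(params)

    return build


defaults = {"max_new_tokens": 256, "temperature": 0.7}
factory = fake_model_factory(defaults)
first, second = factory(), factory()
```

Because each call builds a new object from a copy, two concurrent `step()` calls would not share model state under this sketch, and `defaults` keeps its original `max_new_tokens` key.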

tests/conftest.py

Lines changed: 0 additions & 7 deletions
@@ -10,7 +10,6 @@
 
 Configuration:
     pytest tests/integration/ --sglang-base-url=http://localhost:30000
-    pytest tests/integration/ --sglang-model-id=Qwen/Qwen3-4B-Instruct-2507
 
 Or via environment variables:
     SGLANG_BASE_URL=http://localhost:30000 pytest tests/integration/
@@ -27,12 +26,6 @@ def pytest_addoption(parser):
         default=os.environ.get("SGLANG_BASE_URL", "http://localhost:30000"),
         help="SGLang server URL (default: http://localhost:30000 or SGLANG_BASE_URL env var)",
     )
-    parser.addoption(
-        "--sglang-model-id",
-        action="store",
-        default=os.environ.get("SGLANG_MODEL_ID", "Qwen/Qwen3-4B-Instruct-2507"),
-        help="Model ID (default: Qwen/Qwen3-4B-Instruct-2507 or SGLANG_MODEL_ID env var)",
-    )
 
 
 def pytest_configure(config):
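
The remaining `--sglang-base-url` option resolves its value in the order CLI, then environment variable, then built-in default, because the environment lookup is used as the option's default. A minimal sketch of that precedence follows. It uses `argparse` as a runnable stand-in for pytest's option parser, and `resolve_base_url` is a hypothetical helper that does not exist in the repository.

```python
import argparse
from collections.abc import Mapping, Sequence


def resolve_base_url(argv: Sequence[str], environ: Mapping[str, str]) -> str:
    """Resolve the server URL with priority CLI > env var > default."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--sglang-base-url",
        action="store",
        # The env var (if set) becomes the default; the literal is the last resort.
        default=environ.get("SGLANG_BASE_URL", "http://localhost:30000"),
    )
    return parser.parse_args(list(argv)).sglang_base_url


# An explicit flag wins over the environment variable.
from_cli = resolve_base_url(
    ["--sglang-base-url=http://cli-host:30000"],
    {"SGLANG_BASE_URL": "http://env-host:30000"},
)
# With no flag, the environment variable is used.
from_env = resolve_base_url([], {"SGLANG_BASE_URL": "http://env-host:30000"})
# With neither, the hard-coded default applies.
from_default = resolve_base_url([], {})
```

Passing `os.environ` as the second argument would reproduce the lookup the conftest performs at option-registration time.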

tests/integration/conftest.py

Lines changed: 23 additions & 11 deletions
@@ -1,12 +1,15 @@
 """Shared fixtures for integration tests.
 
 All tests in this directory require a running SGLang server.
+The model ID is auto-detected from the server via /get_model_info.
+Tests are skipped automatically if the server is not reachable.
 
 Configuration (priority: CLI > env var > default):
-    pytest --sglang-base-url=http://localhost:30000 --sglang-model-id=Qwen/Qwen3-4B-Instruct-2507
-    SGLANG_BASE_URL=http://... SGLANG_MODEL_ID=... pytest tests/integration/
+    pytest --sglang-base-url=http://localhost:30000
+    SGLANG_BASE_URL=http://... pytest tests/integration/
 """
 
+import httpx
 import pytest
 from strands_sglang import SGLangClient
 from transformers import AutoTokenizer
@@ -24,21 +27,30 @@ def sglang_base_url(request):
 
 
 @pytest.fixture(scope="session")
-def sglang_model_id(request):
-    """Get model ID from pytest config."""
-    return request.config.getoption("--sglang-model-id")
+def sglang_client(sglang_base_url):
+    """Shared SGLang client for connection pooling. Skips all tests if server is unreachable."""
+    try:
+        response = httpx.get(f"{sglang_base_url}/health", timeout=5)
+        healthy = response.status_code == 200
+    except httpx.HTTPError:
+        healthy = False
+    if not healthy:
+        pytest.skip(f"SGLang server not reachable at {sglang_base_url}")
+    return SGLangClient(sglang_base_url)
 
 
 @pytest.fixture(scope="session")
-def tokenizer(sglang_model_id):
-    """Load tokenizer for the configured model."""
-    return AutoTokenizer.from_pretrained(sglang_model_id)
+def sglang_model_id(sglang_base_url, sglang_client):
+    """Auto-detect model ID from the running SGLang server."""
+    response = httpx.get(f"{sglang_base_url}/get_model_info", timeout=5)
+    response.raise_for_status()
+    return response.json()["model_path"]
 
 
 @pytest.fixture(scope="session")
-def sglang_client(sglang_base_url):
-    """Shared SGLang client for connection pooling."""
-    return SGLangClient(sglang_base_url)
+def tokenizer(sglang_model_id):
+    """Load tokenizer for the detected model."""
+    return AutoTokenizer.from_pretrained(sglang_model_id)
 
 
 @pytest.fixture
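
The control flow these fixtures introduce (probe `/health`, give up if the server is down, otherwise read `model_path` from `/get_model_info`) can be exercised without a live server. The sketch below is an illustration under assumptions, not the commit's code: the HTTP call is replaced by an injectable `fetch` callable, and `detect_model_id`, `healthy_server`, and `down_server` are hypothetical names. Returning `None` stands in for the point where the fixture would call `pytest.skip`.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns (status_code, parsed_json_or_None).
# This seam is an assumption made so the logic can run offline.
Fetcher = Callable[[str], tuple[int, Optional[dict]]]


def detect_model_id(base_url: str, fetch: Fetcher) -> Optional[str]:
    """Return the served model path, or None if the server looks unreachable."""
    try:
        status, _ = fetch(f"{base_url}/health")
    except OSError:
        return None  # connection refused, timeout, DNS failure, ...
    if status != 200:
        return None
    status, body = fetch(f"{base_url}/get_model_info")
    if status != 200 or body is None:
        # Mirrors the fixture's choice to fail loudly once the server is healthy.
        raise RuntimeError(f"/get_model_info failed with status {status}")
    return body["model_path"]


def healthy_server(url: str) -> tuple[int, Optional[dict]]:
    """Fake transport: a server that is up and reports a placeholder model."""
    if url.endswith("/health"):
        return 200, None
    return 200, {"model_path": "example/model"}


def down_server(url: str) -> tuple[int, Optional[dict]]:
    """Fake transport: nothing is listening on the port."""
    raise ConnectionRefusedError("no server listening")


detected = detect_model_id("http://localhost:30000", healthy_server)
skipped = detect_model_id("http://localhost:30000", down_server)
```

Keeping the health probe ahead of the model lookup means an unreachable server yields a skip rather than an error, which matches the ordering enforced by making `sglang_model_id` depend on `sglang_client`.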
