Skip to content

Commit 83e70ac

Browse files
authored
docs: add TESTING.md for python (strands-agents#2961)
1 parent 40c709c commit 83e70ac

2 files changed

Lines changed: 83 additions & 2 deletions

File tree

strands-py/docs/TESTING.md

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,81 @@
1+
# Testing Guidelines - Strands Python SDK
2+
3+
> **IMPORTANT**: When writing tests, you **MUST** follow the guidelines in this document. They keep tests consistent, maintainable, and resilient to unrelated changes.
4+
5+
This document is the authoritative testing reference for the Python SDK. For general development guidance, see [AGENTS.md](../AGENTS.md).
6+
7+
## Test Layout
8+
9+
- **Unit tests** mirror `src/strands/` exactly under `tests/strands/``tests/strands/agent/test_agent.py` tests `src/strands/agent/agent.py`. Do not put unit tests anywhere else.
10+
- **Integration tests** live in `tests_integ/`, organized by feature area, and run only in CI (they need real provider credentials). Whole-message-dict assertions in integ tests are especially brittle — see [Assertions](#assertions).
11+
- **Test files are prefixed `test_`**; test functions are prefixed `test_`.
12+
13+
```bash
14+
hatch test # Run unit tests
15+
hatch test -c # Run with coverage
16+
hatch test tests/strands/agent/ # Run a specific directory
17+
hatch run test-integ # Run integration tests
18+
hatch test --all # All Python versions (3.10–3.14)
19+
```
20+
21+
## Test Fixtures Quick Reference
22+
23+
Reusable fixtures live in `tests/fixtures/`; shared async/AWS helpers are session-scoped fixtures in `tests/conftest.py`. **Reuse these — do not hand-roll a mock model, hook recorder, or tool.** If you find yourself writing one, check here first.
24+
25+
| Fixture / helper | Location | When to use |
26+
| --- | --- | --- |
27+
| `MockedModelProvider` | `tests/fixtures/mocked_model_provider.py` | Drive the agent loop with a pre-defined sequence of responses (optionally with per-response `Usage` for metrics paths) instead of a real provider |
28+
| `MockHookProvider` | `tests/fixtures/mock_hook_provider.py` | Record hook invocations and assert which lifecycle events fired |
29+
| `MockMultiAgentHookProvider` | `tests/fixtures/mock_multiagent_hook_provider.py` | Same, for multi-agent lifecycle events |
30+
| `MockAgentTool` | `tests/fixtures/mock_agent_tool.py` | A stand-in `AgentTool` when a tool is incidental to the test |
31+
| `MockedSessionRepository` | `tests/fixtures/mock_session_repository.py` | In-memory `SessionRepository` for session/persistence tests |
32+
| `agenerator` | `tests/conftest.py` | Wrap a list as an async generator (feed a `MockedModelProvider.stream` or any `async for`) |
33+
| `alist` | `tests/conftest.py` | Collect an async iterable into a list (`events = await alist(agent.stream_async(...))`) |
34+
| `generate` | `tests/conftest.py` | Drive a sync generator to exhaustion, returning `(yielded_events, return_value)` |
35+
| `moto_env`, `moto_mock_aws` | `tests/conftest.py` | Mock AWS credentials/services with `moto` — never hit real AWS in a unit test |
36+
37+
## Async Tests
38+
39+
Pytest runs in **strict asyncio mode** (the default — `asyncio_mode` is not set). Every coroutine test **MUST** carry an explicit marker:
40+
41+
```python
42+
@pytest.mark.asyncio
43+
async def test_streams_tool_use(agenerator):
44+
...
45+
```
46+
47+
A coroutine test without the marker is silently skipped, not run. Consume agent/model streams with the `alist` fixture rather than re-implementing the `async for` collection loop.
48+
49+
## Assertions
50+
51+
**Name the assertion pair `tru_<noun>` / `exp_<noun>`.** This is the one sanctioned exception to the "name every variable for its content" rule — the matched prefixes make arrange/assert pairs scannable. Compute the actual value into `tru_<noun>`, define the expected value as `exp_<noun>`, then compare:
52+
53+
```python
54+
tru_result = agent("hello")
55+
exp_result = {"role": "assistant", "content": [{"text": "hi"}]}
56+
assert tru_result == exp_result
57+
```
58+
59+
The right granularity depends on **who controls the expected shape**:
60+
61+
**When you control the shape, assert the whole object — not field-by-field.** A single equality on the whole structure documents the expected shape and catches unexpected extra fields. Mask only genuinely volatile fields (timestamps, generated ids, metadata) with `unittest.mock.ANY`:
62+
63+
```python
64+
tru_message = result.message
65+
exp_message = {"role": "assistant", "content": [{"text": "abc"}], "metadata": unittest.mock.ANY}
66+
assert tru_message == exp_message
67+
```
68+
69+
**When the shape is an externally-evolving type, assert only the fields the test is about.** This is the natural exception to the rule above: a type like `Message` grows fields the test doesn't care about, so asserting the entire dict couples the test to unrelated churn. If the test is about the text or role, assert just that (`assert result.message["role"] == "assistant"`) — pinning the whole `Message` shape is a recurring source of breakage in `tests_integ/`.
70+
71+
## Test Organization
72+
73+
- **Name tests `test_<method>_<description>`.** The test module already names the subject, so the class/subject should be omitted: `test__init__default_model_id`, `test_update_config_validation_warns_on_unknown_keys`. Add a subject prefix (`test_<subject>_<method>_<description>`) **only** when one module covers several subjects and the bare method name would be ambiguous.
74+
- **Default to flat, module-level functions.** Most of the suite is flat.
75+
- **Use a `class Test<Subject>` only when tests share class-scoped setup or fixtures** — not merely to group related cases (a module already groups them). Inside a class, the method name can drop the redundant subject since the class supplies it.
76+
- **Parametrize repetitive cases** with `@pytest.mark.parametrize` instead of copy-pasting a test body across inputs.
77+
- **Keep tests focused and independent**; import packages at the top of the file.
78+
79+
## Comments in Tests
80+
81+
The evergreen-comment rule applies here too: a regression test should **link the issue and describe the behavior it now guarantees**, not narrate what the code used to do. Prefer `# guards against unbounded retry on a None tool name (#642)` over `# we used to hang on None here`.

strands-ts/docs/TESTING.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -45,10 +45,10 @@ src/subdir/
4545

4646
### Integration Test Location
4747

48-
**Rule**: Integration tests are separate in `tests_integ/`
48+
**Rule**: Integration tests are separate in `test/integ/`
4949

5050
```
51-
tests_integ/
51+
test/integ/
5252
├── api.test.ts # Tests public API
5353
└── environment.test.ts # Tests environment compatibility
5454
```

0 commit comments

Comments
 (0)