diff --git a/AGENTS.md b/AGENTS.md index c9540ef258..301820664d 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -41,9 +41,25 @@ Before designing a feature or changing an API, read the relevant context in `tea - **Skills**: Reusable, repo-specific workflows live under `.agents/skills/` — for PRs (`pr-create`, `pr-writer`, `pr-feedback`), docs (`docs-writer`, `docs-reviewer`, `docs-audit`, `docs-planner`), and code review (`strands-review`). See [`.agents/skills/README.md`](./.agents/skills/README.md) for what each does and when to use it. - **Doc `sourceLinks` track source files**: Doc pages under `site/` point at their backing implementation via `sourceLinks` frontmatter — repo-relative paths into `strands-py/` and `strands-ts/`. When you **rename or move a source file**, update any `sourceLinks` that reference its old path in the same change. The site build only fails on a malformed path or an unmapped file extension, **not** on a path that still resolves to the wrong (or now-nonexistent) file, so a stale reference rots silently. Find affected pages with `grep -rn "" site/src/content/docs`. +### Cross-SDK Conventions + +These rules apply to **both** SDKs. Each sub-guide (`strands-py/AGENTS.md`, `strands-ts/AGENTS.md`) shows the language-idiomatic form; the shared intent lives here so the two cannot drift apart. The two SDKs aim for parity in *concepts and names*, not identical code. + +- **Plugin / construct naming**: name a construct for what it *does*, not for the interface it implements. `AgentSkills`, `ContextOffloader`, `GoalLoop` — never an `…Plugin` suffix. (Python's `vended_plugins` and TS's `vended-plugins` already follow this.) +- **Cross-SDK parity**: when a name, constant, or hook event exists in both SDKs, keep them in sync. + - **Identifiers** match, re-cased to the language idiom (`snake_case` ↔ `camelCase`). + - **Single-word string-literal values** are byte-identical (`'user'`, `'success'`). + - **Multi-word string-literal values** are `snake_case` in Python and `camelCase` in TypeScript (`tool_use` ↔ `toolUse`); convert via an explicit map, never emit the other language's casing. + - **Wire field names** (keys exchanged with a provider API) keep their wire format in both SDKs even when it breaks the language's casing convention (`inputSchema`, `tool_use_id`). + - **Hook event names are shared** across SDKs (modulo the suffix convention). When you add a hook event in one SDK, add the matching name in the other. +- **Public vs internal API**: mark anything exported-but-not-public so consumers don't depend on it — Python keeps it out of `__all__` (and should prefix the module `_`); TypeScript keeps it out of the `index.ts` barrel and tags it `@internal`. +- **Structured logging format**: `field=, field= | lowercase human-readable message`, no punctuation, pipe-separate multiple statements. Python interpolates with `%s` (never f-strings; ruff `G` enforces it); TypeScript uses template literals (never printf `%s`/`%d`). +- **Evergreen comments**: comments explain WHAT/WHY, never how the code changed or what it used to be ("improved", "previously", "used to", "which would previously have crashed"). This applies to tests too — a regression test links its issue and states the behavior it guarantees. ("deprecated"/"legacy" is fine when it describes a stable API surface or runtime state; it's forbidden only when narrating how the code itself changed.) +- **Directory & file naming parity**: subsystem directories use the language-idiomatic separator (`snake_case` in Python, `kebab-case` in TypeScript) but the stem matches word-for-word so they are mechanically translatable (`vended_plugins/` ↔ `vended-plugins/`, `conversation_manager/` ↔ `conversation-manager/`). + ### Testing -When writing tests, follow the sub-project's testing guidance — e.g. `strands-ts/docs/TESTING.md` for the TypeScript SDK. +When writing tests, follow the sub-project's testing guidance — `strands-py/docs/TESTING.md` for the Python SDK, `strands-ts/docs/TESTING.md` for the TypeScript SDK. **`test-infra/` guardrails.** The `test-infra/` CDK stack deploys real AWS resources (Bedrock KBs, EC2 instances) that a small subset of integration tests depend on. Most tests do not need it — they run without provisioned infrastructure. diff --git a/strands-py/AGENTS.md b/strands-py/AGENTS.md index 447edc67d2..c7ca7ef37f 100644 --- a/strands-py/AGENTS.md +++ b/strands-py/AGENTS.md @@ -1,6 +1,8 @@ -# AGENTS.md +# Agent Development Guide - Python SDK -This document provides context, patterns, and guidelines for AI coding assistants working in this repository. For human contributors, see [CONTRIBUTING.md](../CONTRIBUTING.md). +This document provides context, patterns, and guidelines for AI coding assistants working on the Strands Python SDK (`strands-py/`). For human contributors, see [CONTRIBUTING.md](../CONTRIBUTING.md). + +> **Cross-SDK rules live in the [root AGENTS.md](../AGENTS.md).** Plugin naming, cross-SDK parity, public/internal API marking, the structured-logging format, and the evergreen-comment rule apply to both SDKs and are stated once there — this file shows only the Python-idiomatic form and the rules unique to Python. When a rule applies to both SDKs, edit the root, not this file. ## Product Overview @@ -54,72 +56,67 @@ pre-commit install -t pre-commit -t commit-msg # Install hooks ### 3. Pull Request Guidelines -When creating pull requests, you MUST follow the guidelines in PR.md. Key principles: +When creating pull requests, you MUST follow the guidelines in [PR.md](../team/PR.md). Key principles: + +- Focus on WHY: Explain motivation and user impact, not implementation details +- Document public API changes: Show before/after code examples +- Be concise: Use prose over bullet lists; avoid exhaustive checklists +- Target senior engineers: Assume familiarity with the SDK +- Exclude implementation details: Leave these to code comments and diffs -Focus on WHY: Explain motivation and user impact, not implementation details -Document public API changes: Show before/after code examples -Be concise: Use prose over bullet lists; avoid exhaustive checklists -Target senior engineers: Assume familiarity with the SDK -Exclude implementation details: Leave these to code comments and diffs -See PR.md for the complete guidance and template. +See [PR.md](../team/PR.md) for the complete guidance and template. ### 4. Quality Gates -Pre-commit hooks run automatically on commit: -- Formatting (ruff) -- Linting (ruff + mypy) -- Tests (pytest) -- Commit message validation (commitizen) +Pre-commit hooks run automatically on commit and must all pass: formatting (ruff), linting (ruff + mypy strict), tests (pytest), and commit-message validation (commitizen). Run them yourself with `hatch fmt --formatter` and `hatch fmt --linter` before committing. -All checks must pass before commit is allowed. +**ruff, isort, mypy, and pydocstyle already enforce import order, line length (120), logging format, docstring presence, and type-annotation coverage.** This guide does not re-list those — it covers the conventions a linter *cannot* check. When a rule below says "enforced by review," it means exactly that: no tool catches it, so it is on you and the reviewer. ## Coding Patterns and Best Practices -### Logging Style - -Use structured logging with field-value pairs followed by human-readable messages: - -```python -logger.debug("field1=<%s>, field2=<%s> | human readable message", field1, field2) -``` +### Logging -**Guidelines:** -- Add context as `FIELD=` pairs at the beginning -- Separate pairs with commas -- Enclose values in `<>` for readability (especially for empty values) -- Use `%s` string interpolation (not f-strings) for performance -- Use lowercase messages, no punctuation -- Separate multiple statements with pipe `|` +Structured logging is a cross-SDK rule — see the format in the [root AGENTS.md](../AGENTS.md). The Python-idiomatic form uses `%s` interpolation (never f-strings; ruff `G` enforces this) so interpolation is skipped when the log level is disabled: -**Good:** ```python logger.debug("user_id=<%s>, action=<%s> | user performed action", user_id, action) -logger.info("request_id=<%s>, duration_ms=<%d> | request completed", request_id, duration) -logger.warning("attempt=<%d>, max_attempts=<%d> | retry limit approaching", attempt, max_attempts) -``` - -**Bad:** -```python -logger.debug(f"User {user_id} performed action {action}") # Don't use f-strings -logger.info("Request completed in %d ms.", duration) # Don't add punctuation ``` ### Type Annotations -All code must include type annotations: -- Function parameters and return types required -- No implicit optional types -- Use `typing` or `typing_extensions` for complex types -- Mypy strict mode enforced +All code is fully type-annotated (mypy strict enforces parameter/return types and rejects implicit optionals). Use `typing` / `typing_extensions` for complex types. Beyond what the type checker catches: + +- **Spell optionals as PEP 604 unions, not `Optional[...]`.** Write `X | None` and `X | Y`, never `Optional[X]` / `Union[X, Y]` — including for forward references, where ruff's `UP045` will *not* auto-fix it for you. The codebase is ~99% `X | None` already; don't reintroduce `Optional`. +- **Scope every type suppression to a code.** Use `# type: ignore[code]` (no bare `# type: ignore`, no space before the bracket — `# type: ignore [assignment]` silently degrades to a bare ignore). mypy runs with `warn_unused_ignores`, so a coded ignore stays narrow and self-cleans when the underlying error goes away. When narrowing an overridden signature (e.g. `update_config`), the code is `[override]`. +- **Avoid `Any` without a good reason.** Reach for a precise type or a `Protocol` first. ```python def process_message(content: str, max_tokens: int | None = None) -> AgentResult: ... ``` +### Data Structures + +Choose the construct by the role the data plays — don't model the same kind of thing two ways: + +- **`TypedDict` for wire / message / config shapes and `**kwargs` option bags** — anything that is fundamentally a JSON-serializable dict or a keyword-option payload (the Bedrock-API-modeled `content`/`streaming`/`tools` types, and every model-provider `*Config`). Use `total=False` for option/config bags and mark the few mandatory keys with `typing_extensions.Required`; for required wire shapes keep the default `total=True` and mark optional keys with `typing_extensions.NotRequired`. +- **`@dataclass` for runtime objects that own behavior, defaults, or serialization helpers** — results, contexts, events, state, persistence models (e.g. `agent/agent_result.py`). +- **`pydantic BaseModel` only for schemas the model reads or writes** — structured output and tool-input validation (e.g. `vended_plugins/goal/judge.py`). Do not reach for pydantic to model internal data. +- **Do not introduce `NamedTuple`** for new data structures. + +Canonical examples: `types/content.py` (TypedDict), `agent/agent_result.py` (dataclass), `vended_plugins/goal/judge.py` (BaseModel). + +### Naming Conventions + +- **Variables/Functions**: `snake_case` +- **Classes**: `PascalCase` +- **Constants**: `UPPER_SNAKE_CASE` +- **Private members, functions, and modules**: prefix with `_` (e.g. `_helper()`, `_internal.py`) +- **Name every variable for its content**, including short-lived loop/catch bindings — `event`, `index`, `error`, never `e`/`i`/`x`. + ### Docstrings -Use Google-style docstrings for all public functions, classes, and modules: +Use Google-style docstrings for all public functions, classes, and modules. ruff's pydocstyle (`D`) enforces that a docstring *exists* and is well-formed, but **not** that its sections are complete — so the judgment rule is: **include a `Raises:` section whenever a public function raises as part of its contract** (a missing one passes lint and is caught only in review). ```python def example_function(param1: str, param2: int) -> bool: @@ -143,28 +140,7 @@ def example_function(param1: str, param2: int) -> bool: ### Import Organization -Imports must be at the top of the file. - -Imports are automatically organized by ruff/isort: -1. Standard library imports -2. Third-party imports -3. Local application imports - -Use absolute imports for cross-package references, relative imports within packages. - -```python -# Standard library -import logging -from typing import Any - -# Third-party -import boto3 -from pydantic import BaseModel - -# Local -from strands.agent import Agent -from .tools import Tool -``` +Imports go at the top of the file; ruff/isort orders them (stdlib → third-party → local). Use absolute imports for cross-package references, relative imports within a package. ### File Organization @@ -174,60 +150,70 @@ from .tools import Tool - Private modules prefixed with `_` - Test files prefixed with `test_` -### Naming Conventions +### Public API Surface -- **Variables/Functions**: `snake_case` -- **Classes**: `PascalCase` -- **Constants**: `UPPER_SNAKE_CASE` -- **Private members, functions, and modules**: Prefix with `_` (e.g. `_helper()`, `_internal.py`) +Public/internal marking is a cross-SDK rule — see the [root AGENTS.md](../AGENTS.md). The Python-idiomatic form: -### Error Handling +- **Every public package declares its surface with an explicit `__all__`** in its `__init__.py` (entries listed alphabetically). Internal (`_`-prefixed) and namespace-only packages may omit it. Don't lean on the redundant `import X as X` re-export alias in place of `__all__` for a public package. +- **Optional or heavy model providers stay out of `__all__`** and are lazy-loaded via a module-level `__getattr__` (see `models/__init__.py`) so importing the package doesn't pull every provider's third-party dependency. -- Use custom exceptions from `strands.types.exceptions` -- Provide clear error messages with context -- Don't swallow exceptions silently +### Error Handling -## Testing Patterns +- Use custom exceptions from `strands.types.exceptions`; provide clear, context-rich messages; never swallow exceptions silently. +- **Reserve a bare `raise Exception(...)` for nothing — pick the specific type.** A malformed model response is a `ValueError` (the convention across every provider); a timeout is a `TimeoutError`. When you re-raise, preserve the cause with `raise NewError(...) from original`. +- **Model providers translate vendor errors into the SDK's typed exceptions.** A vendor SDK's context-window / token-overflow error becomes `ContextWindowOverflowException`; rate-limiting / throttling (HTTP 429) becomes `ModelThrottledException`. Preserve the original with `raise ... from error`; never let a raw vendor exception escape the provider boundary. Most providers already do this (see `openai.py`, `bedrock.py`, `anthropic.py`); match them when adding or touching one. -### Unit Tests (`tests/`) +### Extensible Interfaces (Protocol vs ABC) -- Mirror the `src/strands/` structure exactly -- Focus on isolated component testing -- Use mocking for external dependencies (models, AWS services) -- Use fixtures from `tests/fixtures/` (e.g., `mocked_model_provider.py`) +> **Before adding or changing a public interface, read [docs/STYLE_GUIDE.md](./docs/STYLE_GUIDE.md).** It carries two rules a linter cannot enforce, summarized here so you don't miss them: -```python -# tests/strands/agent/test_agent.py mirrors src/strands/agent/agent.py -``` +- **Use a `Protocol` with `**kwargs`, not a `Callable`, for extensible interfaces.** A `Protocol` lets you add optional keyword arguments later without breaking implementors; a bare `Callable` signature can't evolve. Example: `def __call__(self, state: GraphState, **kwargs: Any) -> bool: ...`. +- **Reference a tool by its `tool_name` property, never a hardcoded string.** `tool.tool_name` stays correct when a tool is renamed; a string literal silently rots. -### Integration Tests (`tests_integ/`) +### Hook Events -- End-to-end testing with real model providers -- Require credentials/API keys (set via environment variables) -- Organized by feature area +> **Before adding a hook event, read [docs/HOOKS.md](./docs/HOOKS.md).** Event names take an `Event` suffix, paired events follow `Before{Action}Event` / `After{Action}Event` (the action word comes *after* the lifecycle word), and every `Before` has a matching `After` invoked in reverse registration order. Hook event names are shared with the TypeScript SDK — see the cross-SDK parity rule in the [root AGENTS.md](../AGENTS.md). -### Test File Naming +## Adding a Model Provider -- Unit tests: `test_{module}.py` in `tests/strands/{path}/` -- Integration tests: `test_{feature}.py` in `tests_integ/` +A model provider is the single most-repeated pattern in the SDK. Follow the existing skeleton — `models/anthropic.py` is the cleanest reference. -### Running Tests +- **One flat file** `models/.py` defining `class XModel(Model)`. (`models/model.py` is the abstract base; `models/_validation.py` and `models/_defaults.py` hold the shared helpers.) +- **Config is a nested `TypedDict`** `class XConfig(BaseModelConfig, total=False)` declared inside the class, with mandatory keys marked `typing_extensions.Required` (e.g. `model_id: Required[str]`). +- **`__init__(self, *, , **model_config: Unpack[XConfig])`** calls `validate_config_keys(model_config, self.XConfig)` *before* storing, then stores the typed config (`self.config = XModel.XConfig(**model_config)` — store the `TypedDict`, don't keep a plain `dict` and `cast` later). +- **Implement the abstract contract** from `Model`: `update_config` (re-validate, then `self.config.update(...)`; decorate `@override` and suppress with `# type: ignore[override]` because it narrows `**kwargs`), `get_config` (return `resolve_config_metadata(self.config, self.config["model_id"])`), `stream`, and `structured_output`. +- **Conversion helpers are conventional**, not abstract: `format_request(...)` builds the vendor request and `format_chunk(event) -> StreamEvent` maps a vendor chunk to a `StreamEvent`. +- **Map vendor errors** to typed SDK exceptions and chain the cause (see Error Handling above). -```bash -hatch test # Run unit tests -hatch test -c # Run with coverage -hatch run test-integ # Run integration tests -hatch test tests/strands/agent/ # Run specific directory -hatch test --all # Test all Python versions (3.10-3.13) +```python +class XModel(Model): + class XConfig(BaseModelConfig, total=False): + model_id: Required[str] + params: dict[str, Any] | None + + def __init__(self, *, client_args: dict[str, Any] | None = None, **model_config: Unpack[XConfig]) -> None: + validate_config_keys(model_config, self.XConfig) + self.config = XModel.XConfig(**model_config) + self.client = vendor.AsyncClient(**(client_args or {})) + + @override + def update_config(self, **model_config: Unpack[XConfig]) -> None: # type: ignore[override] + validate_config_keys(model_config, self.XConfig) + self.config.update(model_config) + + @override + def get_config(self) -> XConfig: + return resolve_config_metadata(self.config, self.config["model_id"]) ``` -### Writing Tests +### Async & Streaming -- Use pytest fixtures for setup/teardown -- Use `moto` for mocking AWS services -- Use `pytest.mark.asyncio` for async tests -- Keep tests focused and independent -- Import packages at the top of the test files +- **`stream` is an async generator.** Annotate provider overrides `-> AsyncGenerator[StreamEvent, None]` (the abstract base widens the return to `AsyncIterable`); annotate `structured_output` `-> AsyncGenerator[dict[str, T | Any], None]`. +- **Everything after `system_prompt` in a `stream` override is keyword-only — place a bare `*` before `tool_choice`.** Every provider does this; it keeps the orchestrator's keyword call site stable as new options are added. +- **Never buffer a stream into a list before yielding.** Consume upstream async iterables with `async for` and yield as you go. +- **Async-first with thin sync facades.** Public APIs come in async form (`invoke_async`, `stream_async`) plus thin sync wrappers (`__call__`, `invoke`) that delegate via `_async.run_async`. Never block the event loop — wrap blocking or third-party-sync calls in `asyncio.to_thread` (see `bedrock.py`). +- **The floor is Python 3.10.** Do not use `asyncio.TaskGroup` or `asyncio.timeout` (3.11+) without a `sys.version_info` gate and a 3.10 fallback (see `multiagent/swarm.py`); for concurrent work use `asyncio.create_task` and clean up in `finally` with `task.cancel()` then `await asyncio.gather(*tasks, return_exceptions=True)`. +- **Cancellation is cooperative.** It uses an internal `threading.Event` set by `agent.cancel()` and checked at turn/tool boundaries (see `agent/_concurrency.py`); the SDK does not accept an external cancel token. *(The TypeScript SDK uses `AbortSignal` instead — see its AGENTS.md.)* ## MCP Tasks (Experimental) @@ -283,26 +269,35 @@ Task-augmented execution is used when ALL conditions are met: - `tests_integ/mcp/test_mcp_client_tasks.py` - Integration tests - `tests_integ/mcp/task_echo_server.py` - Test server with task support -## Things to Do +## Testing + +> **Before writing tests, read [docs/TESTING.md](./docs/TESTING.md).** It is the authoritative reference for the shared fixtures (reuse them — don't hand-roll mock models/hooks), the `tru_`/`exp_` assertion-pair naming, whole-object assertions, and test organization. The summary below is just the entry point. + +- **Unit tests** in `tests/` mirror the `src/strands/` structure exactly (`tests/strands/agent/test_agent.py` ↔ `src/strands/agent/agent.py`); use mocking for external dependencies and fixtures from `tests/fixtures/`. +- **Integration tests** in `tests_integ/` exercise real providers end to end, require credentials, and are organized by feature area. +- **File naming**: `test_{module}.py` in `tests/strands/{path}/`; `test_{feature}.py` in `tests_integ/`. +- **Writing tests**: pytest fixtures for setup/teardown, `moto` for AWS, `@pytest.mark.asyncio` on every async test (strict mode), keep tests focused and independent, import packages at the top of the file. + +## Quick Rules -- Use explicit return types for all functions -- Write Google-style docstrings for public APIs -- Use structured logging format -- Add type annotations everywhere -- Use relative imports within packages -- Mirror src/ structure in tests/ -- Run `hatch fmt --formatter` and `hatch fmt --linter` before committing -- Follow conventional commits (`feat:`, `fix:`, `docs:`, etc.) +The un-lintable rules an agent most often misses — every one is **enforced by review**, not tooling (formatting, import order, line length, docstring presence, and type-annotation coverage are already handled by ruff/mypy; run `hatch fmt`). Cross-SDK rules (plugin naming, parity, public/internal marking, logging format, evergreen comments) are in the [root AGENTS.md](../AGENTS.md). -## Things NOT to Do +**Do:** +- Spell optionals `X | None` (never `Optional[X]`); scope suppressions `# type: ignore[code]` +- Choose the data construct by role (TypedDict / dataclass / pydantic) — see Data Structures +- Document a `Raises:` section when a public function raises as part of its contract +- Declare a public package's surface with an explicit, alphabetized `__all__` +- Use a `Protocol` with `**kwargs` (not `Callable`) for extensible interfaces; compare tools by `tool_name`, not a string literal +- Name hook events `…Event` with `Before{Action}Event`/`After{Action}Event` pairing (read `docs/HOOKS.md` first) +- In a `stream` override, put a bare `*` before `tool_choice`; type it `AsyncGenerator[StreamEvent, None]` +- Wrap blocking/sync work in `asyncio.to_thread`; gate any 3.11+ asyncio API behind `sys.version_info` +- Name every variable for its content, even short-lived loop/catch bindings (`event`/`index`/`error`, never `e`/`i`/`x`) -- Don't use f-strings in logging calls -- Don't use `Any` type without good reason -- Don't skip type annotations -- Don't put unit tests outside `tests/strands/` structure -- Don't commit without running pre-commit hooks -- Don't add punctuation to log messages -- Don't use implicit optional types +**Don't:** +- `raise Exception(...)` — raise a specific type and chain the cause with `from` +- Let a raw vendor error escape a provider boundary — translate to a typed SDK exception +- Keep a model-provider config as a plain `dict` + `cast` — store the `TypedDict` +- Buffer a stream into a list before yielding ## Development Commands @@ -342,10 +337,7 @@ hatch build # Build package ### Code Comments -- Comments should explain WHAT the code does or WHY it exists -- NEVER add comments about what used to be there or how something changed -- NEVER refer to temporal context ("recently refactored", "moved") -- Keep comments concise and evergreen +Comments explain WHAT/WHY and stay evergreen — the full rule (including how it applies to tests, and the deprecated/legacy nuance) is in the [root AGENTS.md](../AGENTS.md). ### Code Review Considerations @@ -361,6 +353,7 @@ hatch build # Build package - [CONTRIBUTING.md](../CONTRIBUTING.md) - Human contributor guidelines - [docs/](./docs/) - Developer documentation - [STYLE_GUIDE.md](./docs/STYLE_GUIDE.md) - Code style conventions + - [TESTING.md](./docs/TESTING.md) - Testing reference - [HOOKS.md](./docs/HOOKS.md) - Hooks system guide - - [PR.md](../team/PR.md) - PR description guidelines - [MCP_CLIENT_ARCHITECTURE.md](./docs/MCP_CLIENT_ARCHITECTURE.md) - MCP threading design +- [PR.md](../team/PR.md) - PR description guidelines diff --git a/strands-py/docs/TESTING.md b/strands-py/docs/TESTING.md new file mode 100644 index 0000000000..12be006aab --- /dev/null +++ b/strands-py/docs/TESTING.md @@ -0,0 +1,78 @@ +# Testing Guidelines - Strands Python SDK + +> **IMPORTANT**: When writing tests, you **MUST** follow the guidelines in this document. They keep tests consistent, maintainable, and resilient to unrelated changes. + +This document is the authoritative testing reference for the Python SDK. For general development guidance, see [AGENTS.md](../AGENTS.md). It is the counterpart to the TypeScript SDK's [docs/TESTING.md](../../strands-ts/docs/TESTING.md); where a convention is shared, the two should stay aligned. + +## Test Layout + +- **Unit tests** mirror `src/strands/` exactly under `tests/strands/` — `tests/strands/agent/test_agent.py` tests `src/strands/agent/agent.py`. Do not put unit tests anywhere else. +- **Integration tests** live in `tests_integ/`, organized by feature area, and run only in CI (they need real provider credentials). Whole-message-dict assertions in integ tests are especially brittle — see [Assertions](#assertions). +- **Test files are prefixed `test_`**; test functions are prefixed `test_`. + +```bash +hatch test # Run unit tests +hatch test -c # Run with coverage +hatch test tests/strands/agent/ # Run a specific directory +hatch run test-integ # Run integration tests +hatch test --all # All Python versions (3.10–3.14) +``` + +## Test Fixtures Quick Reference + +Reusable fixtures live in `tests/fixtures/`; shared async/AWS helpers are session-scoped fixtures in `tests/conftest.py`. **Reuse these — do not hand-roll a mock model, hook recorder, or tool.** If you find yourself writing one, check here first. + +| Fixture / helper | Location | When to use | +| --- | --- | --- | +| `MockedModelProvider` | `tests/fixtures/mocked_model_provider.py` | Drive the agent loop with a pre-defined sequence of responses (optionally with per-response `Usage` for metrics paths) instead of a real provider | +| `MockHookProvider` | `tests/fixtures/mock_hook_provider.py` | Record hook invocations and assert which lifecycle events fired | +| `MockMultiAgentHookProvider` | `tests/fixtures/mock_multiagent_hook_provider.py` | Same, for multi-agent lifecycle events | +| `MockAgentTool` | `tests/fixtures/mock_agent_tool.py` | A stand-in `AgentTool` when a tool is incidental to the test | +| `MockedSessionRepository` | `tests/fixtures/mock_session_repository.py` | In-memory `SessionRepository` for session/persistence tests | +| `agenerator` | `tests/conftest.py` | Wrap a list as an async generator (feed a `MockedModelProvider.stream` or any `async for`) | +| `alist` | `tests/conftest.py` | Collect an async iterable into a list (`events = await alist(agent.stream_async(...))`) | +| `generate` | `tests/conftest.py` | Drive a sync generator to exhaustion, returning `(yielded_events, return_value)` | +| `moto_env`, `moto_mock_aws` | `tests/conftest.py` | Mock AWS credentials/services with `moto` — never hit real AWS in a unit test | + +## Async Tests + +Pytest runs in **strict asyncio mode** (the default — `asyncio_mode` is not set). Every coroutine test **MUST** carry an explicit marker: + +```python +@pytest.mark.asyncio +async def test_streams_tool_use(agenerator): + ... +``` + +A coroutine test without the marker is silently skipped, not run. Consume agent/model streams with the `alist` fixture rather than re-implementing the `async for` collection loop. + +## Assertions + +**Name the assertion pair `tru_` / `exp_`.** This is the one sanctioned exception to the "name every variable for its content" rule — the matched prefixes make arrange/assert pairs scannable. Compute the actual value into `tru_`, define the expected value as `exp_`, then compare: + +```python +tru_result = agent("hello") +exp_result = {"role": "assistant", "content": [{"text": "hi"}]} +assert tru_result == exp_result +``` + +**Assert whole objects, not field-by-field.** A single equality on the whole structure documents the expected shape and catches unexpected extra fields. Mask only genuinely volatile fields (timestamps, generated ids, metadata) with `unittest.mock.ANY`: + +```python +tru_message = result.message +exp_message = {"role": "assistant", "content": [{"text": "abc"}], "metadata": unittest.mock.ANY} +assert tru_message == exp_message +``` + +**Do not assert a full `Message` dict when the test only cares about part of it.** If the test is about the text or role, assert just that (`assert result.message["role"] == "assistant"`). Asserting the entire `Message` shape couples the test to fields it doesn't care about, so an unrelated change to the `Message` type breaks it — this is a recurring source of breakage in `tests_integ/`. + +## Test Organization + +- **Default to flat, module-level `test_` functions.** Most of the suite is flat. +- **Use a `class Test` only when tests share class-scoped setup or fixtures** — not merely to group related cases (a module already groups them). +- **Parametrize repetitive cases** with `@pytest.mark.parametrize` instead of copy-pasting a test body across inputs. +- **Keep tests focused and independent**; import packages at the top of the file. + +## Comments in Tests + +The evergreen-comment rule applies here too: a regression test should **link the issue and describe the behavior it now guarantees**, not narrate what the code used to do. Prefer `# guards against unbounded retry on a None tool name (#642)` over `# we used to hang on None here`. diff --git a/strands-ts/AGENTS.md b/strands-ts/AGENTS.md index 3107326ce6..2df6d48929 100644 --- a/strands-ts/AGENTS.md +++ b/strands-ts/AGENTS.md @@ -2,15 +2,47 @@ This document provides guidance for AI agents working on the Strands TypeScript SDK (`strands-ts/`). For human contributor guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md). +> **Cross-SDK rules live in the [root AGENTS.md](../AGENTS.md).** Plugin naming, cross-SDK parity, public/internal API marking, the structured-logging format, and the evergreen-comment rule apply to both SDKs and are stated once there — this file shows only the TypeScript-idiomatic form and the rules unique to TypeScript. When a rule applies to both SDKs, edit the root, not this file. + +## Product Overview + +The Strands TypeScript SDK is the TypeScript/JavaScript port of the Strands Agents framework for building AI agents with a model-driven approach. It mirrors the Python SDK in concepts and names (re-cased to TS idiom) while being idiomatic TypeScript, and runs in both Node.js and browser environments. + +**Core Features:** +- Model Agnostic: Multiple model providers (Amazon Bedrock, Anthropic, OpenAI, Google, Vercel AI SDK, etc.) +- Tools: `tool()` factory with Zod schemas +- MCP Integration: Native Model Context Protocol support +- Multi-Agent Systems: Agent-to-agent, swarms, and graph patterns +- Streaming Support: Real-time response streaming via async generators +- Hooks: Event-driven extensibility for agent lifecycle +- Session Management: Pluggable session managers +- Observability: OpenTelemetry tracing and metrics + +## Directory Structure + +``` +strands-ts/ +├── src/ # All production source code +│ ├── agent/ # Core agent loop, state, agent-as-tool +│ ├── models/ # Model provider implementations (anthropic.ts, bedrock.ts, openai/, google/, …) +│ ├── tools/ # Tool system (registry, execution, factory) +│ ├── conversation-manager/ # Message history, compression +│ ├── multiagent/ # Multi-agent orchestration (A2A, graphs, swarm) +│ ├── session/ # Session persistence +│ ├── telemetry/ # Tracing and metrics +│ ├── hooks/ # Agent lifecycle hooks +│ └── types/ # Shared type definitions +├── src/**/__tests__/ # Unit tests (co-located with source) +├── test/integ/ # Integration tests with real providers +├── docs/ # Developer documentation +└── package.json # SDK package config, dependencies, npm scripts +``` + ## Development Workflow ### 1. Environment Setup -See [CONTRIBUTING.md - TypeScript SDK](../CONTRIBUTING.md#typescript-sdk) for: - -- Prerequisites (Node.js 20+, npm) -- Installation steps -- Verification commands +See [CONTRIBUTING.md - TypeScript SDK](../CONTRIBUTING.md#typescript-sdk) for prerequisites (Node.js 20+, npm), installation, and verification commands. ### 2. Making Changes @@ -35,237 +67,186 @@ See [PR.md](../team/PR.md) for the complete guidance and template. ### 4. Quality Gates -Pre-commit hooks automatically run: - -- Build (via npm run build, required for workspace type resolution) -- Unit tests with coverage (via npm run test:coverage) -- Linting (via npm run lint) -- Format checking (via npm run format:check) -- Type checking (via npm run type-check) +Pre-commit hooks run automatically and must all pass: build (`npm run build`), unit tests with coverage (`npm run test:coverage`), linting (`npm run lint`), format check (`npm run format:check`), and type check (`npm run type-check`). -All checks must pass before commit is allowed. +**ESLint and Prettier already enforce formatting (no semicolons, single quotes, 120 cols), the `no-explicit-any` and `explicit-function-return-type` rules, and TSDoc syntax.** This guide does not re-list those — it covers the conventions a linter *cannot* check. When a rule below says "enforced by review," it means exactly that: no tool catches it, so it is on you and the reviewer. ### 5. Testing Guidelines -When writing tests, you **MUST** follow the guidelines in [docs/TESTING.md](./docs/TESTING.md). Key topics covered: +When writing tests, you **MUST** follow the guidelines in [docs/TESTING.md](./docs/TESTING.md): test organization and file location, batching strategy, object-assertion best practices, coverage requirements, and multi-environment (Node.js + browser) testing. Canonical paths: unit tests co-located under `src/**/__tests__/`, integration tests under `test/integ/`. -- Test organization and file location -- Test batching strategy -- Object assertion best practices -- Test coverage requirements -- Multi-environment testing (Node.js and browser) +## Coding Patterns and Best Practices -See [TESTING.md](./docs/TESTING.md) for the complete testing reference. +### Logging -## Coding Patterns and Best Practices +Structured logging is a cross-SDK rule — see the format in the [root AGENTS.md](../AGENTS.md). The TypeScript-idiomatic form uses template literals (never printf `%s`/`%d`): -### Logging Style Guide +```typescript +logger.warn(`stop_reason=<${stopReason}>, fallback=<${fallback}> | unknown stop reason, converting to camelCase`) +logger.warn('cache points are not supported in openai system prompts, ignoring cache points') // no context fields +``` -The SDK uses a structured logging format consistent with the Python SDK for better log parsing and searchability. +### TypeScript Type Safety -**Format**: +**Optional chaining for null safety**: prefer optional chaining over verbose `typeof` checks when accessing potentially undefined properties: ```typescript -// With context fields -logger.warn(`field1=<${value1}>, field2=<${value2}> | human readable message`) - -// Without context fields -logger.warn('human readable message') +// Good +return globalThis?.process?.env?.API_KEY -// Multiple statements in message (use pipe to separate) -logger.warn(`field=<${value}> | statement one | statement two`) +// Bad +if (typeof process !== 'undefined' && typeof process.env !== 'undefined') { + return process.env.API_KEY +} +return undefined ``` -**Guidelines**: +**Strict requirements** (beyond what ESLint catches): -1. **Context Fields** (when relevant): - - Add context as `field=` pairs at the beginning - - Use commas to separate pairs - - Enclose values in `<>` for readability (especially helpful for empty values: `field=<>`) - - Use template literals for string interpolation +- **Function and method *signatures* MUST have explicit return types** (ESLint enforces this). Let TypeScript infer the types of *local variables and intermediate expressions* — don't annotate those redundantly. +- Never use `any` (enforced by ESLint). Reach for `unknown` (then narrow) when a type is genuinely open. +- Use TypeScript strict-mode features. -2. **Messages**: - - Add human-readable messages after context fields - - Use lowercase for consistency - - Avoid punctuation (periods, exclamation points) to reduce clutter - - Keep messages concise and focused on a single statement - - If multiple statements are needed, separate them with pipe character (`|`) - -**Examples**: +**`interface` vs `type` alias**: Use an `interface` for object shapes — model configs, `*Options`/`*Config` bags, data blocks, capability contracts — and prefer `extends` for composition (even when extending an interface in the same file). Use a `type` alias only when the type is *not* a plain object shape: unions (including discriminated unions), string-literal unions, intersections, function types, or mapped/conditional/utility-derived types. ```typescript -// Good: Context fields with message -logger.warn(`stop_reason=<${stopReason}>, fallback=<${fallback}> | unknown stop reason, converting to camelCase`) -logger.warn(`event_type=<${eventType}> | unsupported bedrock event type`) +// Good — object shape as an interface, composed with extends +interface AnthropicModelConfig extends BaseModelConfig { maxTokens?: number } +interface AnthropicModelOptions extends AnthropicModelConfig { apiKey?: string } + +// Good — type alias for a discriminated union (not an object shape) +type OpenAIModelOptions = + | ({ api?: 'responses' } & OpenAIResponsesConfig & OpenAIClientOptions) + | ({ api: 'chat' } & OpenAIChatConfig & OpenAIClientOptions) +``` -// Good: Simple message without context fields -logger.warn('cache points are not supported in openai system prompts, ignoring cache points') +The model-provider config family (`BaseModelConfig` and each `{Provider}ModelConfig`/`{Provider}ModelOptions`) is the reference pattern. A handful of older object shapes are still `type X = { ... }` (e.g. `AgentConfig`, `ConversationManagerOptions`, `SessionStorage`) — migrate them to `interface` on next touch, don't rewrite them en masse. -// Good: Multiple statements separated by pipes -logger.warn(`request_id=<${id}> | processing request | starting validation`) +**Optional properties**: `tsconfig` enables `exactOptionalPropertyTypes`, which makes `prop?: T` and `prop?: T | undefined` semantically distinct. Write the bare `prop?: T` form (the overwhelming majority); only add `| undefined` when a caller genuinely needs to pass an explicit `undefined`. -// Bad: Not using angle brackets for values -logger.warn(`stop_reason=${stopReason} | unknown stop reason`) +### Naming Conventions -// Bad: Using punctuation -logger.warn(`event_type=<${eventType}> | Unsupported event type.`) -``` +- **Private class fields**: underscore prefix (`private readonly _config: Config`, `private _state: State`). +- **Name every variable for its content**, including short-lived loop/catch bindings — `event`, `index`, `error`, never `e`/`i`/`x` (e.g. `for (const message of messages)`, not `for (const m of messages)`). +- **Plugin / construct naming** and **cross-SDK literal parity** are cross-SDK rules — see the [root AGENTS.md](../AGENTS.md). In short: name a construct for what it does (`AgentSkills`, not `AgentSkillsPlugin`); match the Python literal re-cased to TS idiom; single-word string-literal *values* are byte-identical, multi-word ones are camelCased in TS (`tool_use` → `toolUse`, via `STOP_REASON_MAP` / `snakeToCamel`, never emitted ad hoc); wire field *names* keep their wire format (`inputSchema`, `tool_use_id`). ### Import Organization -Organize imports in this order: +Order imports: (1) external dependencies, (2) internal modules via relative paths, (3) type-only imports (`import type { … }`). -```typescript -// 1. External dependencies -import { something } from 'external-package' +### File Organization -// 2. Internal modules (using relative paths) -import { Agent } from '../agent' -import { Tool } from '../tools' +- **Unit tests co-located** under `__tests__/` beside the source (`src/module.ts` ↔ `src/__tests__/module.test.ts`); integration tests under `test/integ/`. +- **Function ordering within a file** reads top-down, most general to most specific: main entry-point/exported functions at the top, private helpers below in order of use. +- **Keep functions small and focused** — a single responsibility each. -// 3. Types (if separate) -import type { Options, Config } from '../types' -``` +### Public API Surface -### File Organization Pattern +Public/internal marking is a cross-SDK rule — see the [root AGENTS.md](../AGENTS.md). The TypeScript-idiomatic form: -**For source files**: +- **Named exports only — never `export default`.** The codebase has zero default exports. +- **Convey module privacy by *not* re-exporting from the barrel (`index.ts`)**, not by an `_`-prefixed filename. Unlike the Python SDK, TS has no `_`-prefixed module files; an internal-but-cross-module symbol gets the `@internal` TSDoc tag instead (see the `CancelledError` intentional-omission comment in `index.ts`). +- **Source and test files are kebab-case** (`tool-caller.ts`, `agent-skills.ts`); the class/type *inside* keeps its PascalCase. No linter checks filename casing, so this is a review item. -``` -strands-ts/src/ -├── module.ts # Source file -└── __tests__/ - └── module.test.ts # Unit tests co-located -``` +### Documentation Requirements -**Function ordering within files**: +TSDoc is required on all exported functions, classes, and interfaces (syntax enforced by ESLint). The judgment rules a linter can't check: -- Functions MUST be ordered from most general to most specific (top-down reading) -- Public/exported functions MUST appear before private helper functions -- Main entry point functions MUST be at the top of the file -- Helper functions SHOULD follow in order of their usage +- Include `@param` for all parameters and `@returns` for return values. +- Include `@example` only for exported classes (main SDK entry points like `BedrockModel`, `Agent`); do **NOT** add `@example` to type definitions, interfaces, or internal types. +- Include `@throws` on an exported function/method that can throw as part of its contract — name the error type and the condition (the TS counterpart to Python's `Raises:`). +- Interface properties MUST have single-line descriptions. +- Mark non-public exports with the `@internal` tag (it may also be applied to non-exported symbols for clarity). -**For integration tests**: +### Dependency Management -``` -strands-ts/test/integ/ -└── feature.test.ts # Tests public API -``` +When adding or modifying dependencies, you **MUST** follow [docs/DEPENDENCIES.md](./docs/DEPENDENCIES.md). Key points: -### TypeScript Type Safety +- **`dependencies`**: core SDK functionality users don't interact with directly +- **`peerDependencies`**: dependencies that cross API boundaries (users construct/pass instances) +- **`devDependencies`**: build tools, testing frameworks, linters — not shipped to users -**Optional chaining for null safety**: Prefer optional chaining over verbose `typeof` checks when accessing potentially undefined properties: +**Rule**: if a dependency crosses an API boundary, it **MUST** be a peer dependency. -```typescript -// Good: Optional chaining -return globalThis?.process?.env?.API_KEY +### Error Handling -// Bad: Verbose typeof checks -if (typeof process !== 'undefined' && typeof process.env !== 'undefined') { - return process.env.API_KEY -} -return undefined -``` - -**Strict requirements**: +- Handle errors explicitly; prefer optional chaining over verbose `typeof` checks for possibly-undefined access. +- **Model providers translate vendor errors into the SDK's typed errors** — context-window / token overflow → `ContextWindowOverflowError`, throttling / 429 → `ModelThrottledError` — and **preserve the original via the `cause` option** (`new ModelThrottledError(message, { cause: error })`). Never let a raw vendor error escape the provider boundary (see `models/openai/errors.ts`, `models/google/errors.ts`). +- Document thrown errors with `@throws` (see Documentation Requirements). -- Always provide explicit return types -- Never use `any` type (enforced by ESLint) -- Use TypeScript strict mode features -- Leverage type inference where appropriate +## Adding a Model Provider -### Class Field Naming Conventions +The model provider is the most-repeated pattern in the SDK; `models/anthropic.ts` is the cleanest reference. -**Private fields**: Use underscore prefix for private class fields. +- **`export class XModel extends Model`** with `private _config` and `private _client`. +- **Two interfaces**: `interface XModelConfig extends BaseModelConfig` (the model knobs) and a superset `interface XModelOptions extends XModelConfig` adding client/credential fields (`apiKey`, `client`, `clientConfig`). The constructor takes `XModelOptions`, destructures the client fields, and spreads the rest into `_config`. +- **Implement the base contract** from `models/model.ts`: `updateConfig(modelConfig)` (spread-merge into `_config`), `getConfig()` (return `resolveConfigMetadata(this._config, this._config.modelId ?? )`), and `async *stream(...)`. `structuredOutput` is **not** part of the TS provider contract — TS does structured output via a tool + agent config (`tools/structured-output-tool.ts`, `AgentConfig.structuredOutputSchema`), not a provider method. +- **Keep conversion private** (`_formatRequest`, `_formatMessages`, `_formatContentBlock`). Default to a **flat `models/.ts`**; only reach for a `models//` subdirectory with standalone adapter modules + `types.ts`/`errors.ts` when the provider spans **multiple API surfaces** (the reason `openai/` exists — chat vs responses). ```typescript -// Good: Private fields with underscore prefix -export class Example { - private readonly _config: Config - private _state: State +export interface XModelConfig extends BaseModelConfig { + maxTokens?: number } - -// Bad: No underscore for private fields -export class Example { - private readonly config: Config +export interface XModelOptions extends XModelConfig { + apiKey?: string + client?: VendorClient } -``` - -#### Naming Conventions for New Features -When choosing names and constants that match an existing implementation in the Python SDK, use exactly the same literal used in the Python SDK. Wherever we can achieve compatibility, keep the previous convention. +export class XModel extends Model { + private _config: XModelConfig + private _client: VendorClient -#### Plugin Naming + updateConfig(modelConfig: XModelConfig): void { + this._config = { ...this._config, ...modelConfig } + } -Name plugins for what they do, not for the `Plugin` interface they implement. - -```typescript -// Good -export class AgentSkills implements Plugin { ... } -export class DefaultModelRetryStrategy implements Plugin { ... } + getConfig(): XModelConfig { + return resolveConfigMetadata(this._config, this._config.modelId ?? MODEL_DEFAULTS.x.modelId) + } -// Bad -export class AgentSkillsPlugin implements Plugin { ... } + async *stream(messages: Message[], options?: StreamOptions): AsyncIterable { + // map vendor errors to ContextWindowOverflowError / ModelThrottledError with { cause } + } +} ``` -### Documentation Requirements +### Async & Streaming -**TSDoc format** (required for all exported functions): +- **`stream` is an async generator** typed `async *stream(messages, options?): AsyncIterable`, matching the abstract base exactly. Consume async iterables with `for await`. +- **Providers emit raw `ModelStreamEvent`s and must not buffer the whole response.** The base `Model.streamAggregated` performs event accumulation on top of `stream`. +- **The SDK is async-only — no sync facade.** `invoke()` returns a `Promise`; there is no blocking equivalent (unlike the Python SDK's `__call__`). +- **Cancellation uses `AbortSignal`.** `agent.cancelSignal` is exposed, and callers may pass an external `cancelSignal` via `invoke`/`stream` options (merged with `AbortSignal.any`). When merging concurrent async generators, race `.next()` with `Promise.race` and close the rest with `Promise.allSettled(gen.return())` in `finally`; use `Promise.all`/`allSettled` for simple fan-out. -- All exported functions, classes, and interfaces must have TSDoc -- Include `@param` for all parameters -- Include `@returns` for return values -- Include `@example` only for exported classes (main SDK entry points like BedrockModel, Agent) -- Do NOT include `@example` for type definitions, interfaces, or internal types -- Interface properties MUST have single-line descriptions -- TSDoc validation enforced by ESLint -- Mark exported symbols that are not part of the public API with the `@internal` TSDoc tag. Use this for exports that other modules need but SDK consumers should not depend on. +## Testing -### Code Style Guidelines +When writing tests, you **MUST** follow [docs/TESTING.md](./docs/TESTING.md) — it is the authoritative reference. In short: -**Formatting** (enforced by Prettier): +- **Unit tests** co-located under `src/**/__tests__/`; **integration tests** under `test/integ/`. +- **File naming** selects the environment: `*.test.ts` (Node + browser), `*.test.node.ts` (Node only), `*.test.browser.ts` (browser only). +- Use the nested `describe` pattern; assert on whole objects with `toEqual`; reuse fixtures from `src/__fixtures__/` rather than hand-rolling mocks. -- No semicolons -- Single quotes -- Line length: 120 characters -- Tab width: 2 spaces -- Trailing commas in ES5 style +## Quick Rules -### Dependency Management - -When adding or modifying dependencies, you **MUST** follow the guidelines in [docs/DEPENDENCIES.md](./docs/DEPENDENCIES.md). Key points: - -- **`dependencies`**: Core SDK functionality that users don't interact with directly -- **`peerDependencies`**: Dependencies that cross API boundaries (users construct/pass instances) -- **`devDependencies`**: Build tools, testing frameworks, linters - not shipped to users - -**Rule**: If a dependency crosses an API boundary, it **MUST** be a peer dependency. +The un-lintable rules an agent most often misses — every one is **enforced by review**, not tooling (formatting, `no-explicit-any`, explicit return types, and TSDoc syntax are already handled by ESLint/Prettier; run `npm run lint`). Cross-SDK rules (plugin naming, parity, public/internal marking, logging format, evergreen comments) are in the [root AGENTS.md](../AGENTS.md). -## Things to Do +**Do:** +- Use an `interface` for object shapes (with `extends`); a `type` alias only for unions/intersections/function/mapped types +- Use **named exports only**; convey privacy by barrel omission + `@internal`, never an `_`-prefixed filename +- Name source/test files kebab-case (the class inside stays PascalCase) +- Document errors with `@throws`; reserve `@example` for entry-point classes, never type defs +- Translate vendor errors to typed SDK errors and preserve `{ cause }` +- Prefer the bare `prop?: T` optional form under `exactOptionalPropertyTypes` +- Name every variable for its content, even short-lived loop/catch bindings (`event`/`index`/`error`, never `e`/`i`/`x`) -- Use relative imports for internal modules -- Co-locate unit tests with source under `__tests__` directories -- Follow nested describe pattern for test organization -- Write explicit return types for all functions -- Document all exported functions with TSDoc -- Use meaningful variable and function names -- Keep functions small and focused (single responsibility) -- Handle errors explicitly - -## Things NOT to Do - -- Use `any` type (enforced by ESLint) -- Put unit tests in separate `tests/` directory (use `strands-ts/src/**/__tests__/**`) -- Skip documentation for exported functions -- Use semicolons (Prettier will remove them) -- Commit without running pre-commit hooks -- Ignore linting errors -- Use implicit return types +**Don't:** +- `export default` — named exports only +- Use printf-style placeholders (`%s`, `%d`) in logs — use template literals +- Put unit tests in a separate `tests/` directory (use `src/**/__tests__/`) +- Annotate local variables with redundant explicit types (infer locals; annotate signatures) +- Let a raw vendor error escape a provider boundary ## Development Commands -Quick reference: - ```bash npm test # Run unit tests in Node.js npm run test:browser # Run unit tests in browser (Chromium via Playwright) @@ -280,7 +261,7 @@ npm run build # Compile TypeScript ## Agent-Specific Notes -### Writing code +### Writing Code - YOU MUST make the SMALLEST reasonable changes to achieve the desired outcome. - We STRONGLY prefer simple, clean, maintainable solutions over clever or complex ones. @@ -288,19 +269,17 @@ npm run build # Compile TypeScript - YOU MUST MATCH the style and formatting of surrounding code, even if it differs from standard style guides. - Fix broken things immediately when you find them. Don't ask permission to fix bugs. -#### Code Comments +### Code Comments -- NEVER add comments explaining that something is "improved", "better", "new", "enhanced", or referencing what it used to be -- Comments should explain WHAT the code does or WHY it exists, not how it's better than something else -- Comments should be evergreen — never reference temporal context +Comments explain WHAT/WHY and stay evergreen — the full rule (including how it applies to tests, and the deprecated/legacy nuance) is in the [root AGENTS.md](../AGENTS.md). ### Integration with Other Files -- **CONTRIBUTING.md**: Contains testing/setup commands and human contribution guidelines -- **docs/TESTING.md**: Comprehensive testing guidelines (MUST follow when writing tests) -- **team/PR.md**: Pull request guidelines and template -- **package.json**: Root workspace config that delegates to strands-ts -- **strands-ts/package.json**: SDK package config, dependencies, and npm scripts +- **CONTRIBUTING.md**: testing/setup commands and human contribution guidelines +- **docs/TESTING.md**: comprehensive testing guidelines (MUST follow when writing tests) +- **docs/DEPENDENCIES.md**: dependency-management rules +- **team/PR.md**: pull request guidelines and template +- **package.json** / **strands-ts/package.json**: workspace and SDK package config ## Additional Resources diff --git a/strands-ts/docs/TESTING.md b/strands-ts/docs/TESTING.md index 40251b948a..c944272601 100644 --- a/strands-ts/docs/TESTING.md +++ b/strands-ts/docs/TESTING.md @@ -45,10 +45,10 @@ src/subdir/ ### Integration Test Location -**Rule**: Integration tests are separate in `tests_integ/` +**Rule**: Integration tests are separate in `test/integ/` ``` -tests_integ/ +test/integ/ ├── api.test.ts # Tests public API └── environment.test.ts # Tests environment compatibility ```