This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Night Brownie is a minimal Python harness acting as an always-on AI co-maintainer for OSS repositories. It manages process lifecycle, credential injection, message routing, and GitHub event polling. All intelligence lives in containerized agents. The harness owns all GitHub API calls — agents only produce decision + action lists over HTTP, so credentials never enter agent containers.
The MVP target: a maintainer installs Night Brownie, configures one repo, and has issues triaged (labeled, responded to, or closed) without writing code — in under 30 minutes.
New features start as an idea that is hashed out, turned into a spec, and then into a plan. The plan is implemented iteratively.
All files for a feature are written in Markdown and live in the docs/specs/<feature-name>/ directory.
uv sync # install all dependency groups (dev, test, docs)
pre-commit install # install git hooks
Always add --agent-digest=term when running pytest to see token-optimized test results.
Use the python-tester skill when writing Python tests.
uv run pytest --agent-digest=term # run all tests with coverage
uv run pytest --agent-digest=term tests/test_config.py # run a single test file
uv run pytest --agent-digest=term tests/test_config.py::test_name # run a single test
uv run pytest --agent-digest=term --no-cov # run tests without coverage
pre-commit run --all-files # run all linters/formatters
uv run night-brownie start --config config.yaml
The system follows a strict vertical ownership model:
GitHub API polling (poller.py)
→ Event router (router.py) — maps repo+event_type → agent URL
→ Harness HTTP server (server.py) — fetches memory, builds TaskMessage, POSTs to agent
↔ Agent container (agents/issue-triage/) — returns DecisionMessage
→ Executor (executor.py) — translates actions into GitHub API calls
→ Memory (memory.py) — logs every action, updates per-issue summaries
Key constraint: The harness executes all GitHub API calls.
Agents produce DecisionMessage (decision + action list) — they never call GitHub directly.
Task (harness → agent):
{ "task_id": "uuid4", "type": "issue.triage", "repo": "owner/repo",
"payload": {}, "context": { "memory_summary": "...", "llm_backend": {...} } }
Decision (agent → harness):
{ "task_id": "uuid4", "decision": "label_and_respond|close|escalate|skip",
"rationale": "...", "actions": [{"type": "add_label", "label": "bug"}, ...] }
night_brownie/
├── config.py # YAML config loader + Pydantic validation; ${VAR} env resolution
├── credentials.py # Env var resolution; get_github_token()
├── server.py # FastAPI — dispatch loop: fetch memory → build task → POST to agent → execute
├── poller.py # asyncio polling loop; concurrent per-repo with semaphore (default max 5)
├── router.py # event_type + repo → RouteTarget (agent URL + merged config)
├── executor.py # DecisionMessage actions → GitHub API calls (via PyGithub/httpx)
├── memory.py # SQLite: action_log + memory_summary tables; WAL mode
├── protocol.py # Pydantic models: TaskMessage, DecisionMessage, ActionItem
└── llm/
    ├── base.py # Abstract LLMBackend ABC + from_config() factory
    ├── anthropic.py # Wraps LiteLLM for Anthropic
    └── ollama.py # Wraps LiteLLM for Ollama
agents/
└── issue-triage/
    ├── agent.py # FastAPI: POST /task, GET /health
    └── prompts/triage.py
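The poller is described as an asyncio loop that polls repos concurrently behind a semaphore (default max 5). A minimal sketch of that pattern follows. The function name `poll_all` and the injected `poll` callable are hypothetical stand-ins, not the actual poller.py API.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def poll_all(
    repos: list[str],
    poll: Callable[[str], Awaitable[object]],
    max_concurrent: int = 5,
) -> list[object]:
    """Poll every repo, with at most max_concurrent polls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(repo: str) -> object:
        # Each poll waits here until one of the slots frees up.
        async with semaphore:
            return await poll(repo)

    # gather preserves input order, so results line up with repos.
    return await asyncio.gather(*(guarded(repo) for repo in repos))
```

Passing the poll function in as a parameter lets tests substitute a fake without touching the GitHub client.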
Two tables in ~/.agent-harness/memory.db (path overridable in config):
- action_log: every decision logged before execution
- memory_summary: per-repo+issue LLM-generated summary injected into task context on the next dispatch
SQLite is used directly via stdlib sqlite3 — never mock it in tests;
use a real temp-file DB via pytest tmp_path.
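As a rough illustration of the real-database testing approach, the sketch below opens a file-backed SQLite database in WAL mode, creates an action_log table, and exercises it from a pytest-style test that takes tmp_path. The column names and the helper names `open_memory_db` and `log_decision` are assumptions. The real schema lives in memory.py.

```python
import sqlite3
from pathlib import Path


def open_memory_db(path: Path) -> sqlite3.Connection:
    """Open (or create) the DB at path with WAL journaling enabled."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical columns; only the table name comes from the docs.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS action_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            task_id TEXT NOT NULL,
            repo TEXT NOT NULL,
            decision TEXT NOT NULL,
            logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()
    return conn


def log_decision(conn: sqlite3.Connection, task_id: str, repo: str, decision: str) -> int:
    """Insert one row and return its id; call this before executing actions."""
    cur = conn.execute(
        "INSERT INTO action_log (task_id, repo, decision) VALUES (?, ?, ?)",
        (task_id, repo, decision),
    )
    conn.commit()
    return int(cur.lastrowid)


def test_decision_is_logged(tmp_path: Path) -> None:
    # pytest supplies tmp_path; no mocking of sqlite3 is involved.
    conn = open_memory_db(tmp_path / "memory.db")
    try:
        row_id = log_decision(conn, "task-1", "owner/repo", "skip")
        row = conn.execute(
            "SELECT task_id, decision FROM action_log WHERE id = ?", (row_id,)
        ).fetchone()
        assert row == ("task-1", "skip")
    finally:
        conn.close()
```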
All secrets are ${VAR} environment variable references — the config file itself never contains raw secrets.
See config.example.yaml for the full schema.
- Formatter/linter: ruff (line length 119, Google docstring convention)
- Type checking: mypy (--no-strict-optional --ignore-missing-imports)
- Docstrings: interrogate (≥90% coverage), pydoclint (Google style)
- Type hints: required on all public functions and methods; --keep-runtime-typing
- Python minimum: 3.12
Pre-commit hooks enforce: ruff-format, ruff-check, mypy, pydoclint, interrogate, detect-secrets, pyupgrade, check-yaml, check-toml.
- Framework: pytest + pytest-cov; target ≥85% line / ≥80% branch coverage
- LLM calls: Recorded fixtures in tests/fixtures/ (real responses captured once, replayed in CI). No live LLM calls.
- GitHub API calls: Mock PyGithub/httpx at the boundary with pytest-mock.
- SQLite: Use a real in-memory or temp-file DB — never mock it.
- Agent protocol: Integration tests spin up the agent container locally and send real HTTP task messages.
Always automatic:
- Poll configured repos on the set interval
- Inject credentials from environment; never log or expose them
- Write every decision and action to action_log before executing
Require explicit allow_close: true in agent config:
- Closing an issue (default: label + comment only)
Never:
- Call GitHub API as anything other than the configured bot identity
- Store raw secrets in config, logs, or the memory DB
- Execute shell commands or arbitrary code from agent decision payloads
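The rules above suggest an allowlist check on agent actions before anything reaches the GitHub API. The following is a minimal sketch of such a gate, not the executor.py implementation. Only the "add_label" action type appears in this file. "add_comment", "close_issue", and the `ActionRejected` and `validate_actions` names are hypothetical.

```python
from typing import Any

# Assumed action names; only "add_label" is taken from the protocol example.
ALWAYS_ALLOWED = frozenset({"add_label", "add_comment"})
REQUIRES_ALLOW_CLOSE = frozenset({"close_issue"})


class ActionRejected(ValueError):
    """Raised when a decision contains an action the harness will not run."""


def validate_actions(
    actions: list[dict[str, Any]], allow_close: bool = False
) -> list[dict[str, Any]]:
    """Return the actions unchanged if all are permitted, else raise."""
    accepted: list[dict[str, Any]] = []
    for action in actions:
        kind = action.get("type")
        if kind in ALWAYS_ALLOWED:
            accepted.append(action)
        elif kind in REQUIRES_ALLOW_CLOSE:
            if not allow_close:
                raise ActionRejected("closing requires allow_close: true")
            accepted.append(action)
        else:
            # Anything off the allowlist (e.g. a shell command) is refused.
            raise ActionRejected(f"unknown action type: {kind!r}")
    return accepted
```

Rejecting the whole decision on the first bad action, rather than dropping it silently, keeps a partially applied decision from reaching the repository.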