Skills, guardrails, and structural hooks for AI coding agents. Plan, build, test, debug, and ship — any repo, any language.
Best on Claude Code — hooks enforce rules the model cannot bypass.
Also works on Cursor, Codex, Gemini, Windsurf, Aider — via project rules (setup guide).
| Piece | Purpose |
|---|---|
| Skills | Step-by-step workflows — /explore, /implementation, /precommit, … |
| Guardrails | Safety and quality rules (shared/guardrails.md) |
| Hooks | Structural enforcement on Claude Code — block bad writes, gate commits, route skills |
Prompt rules can be ignored. Hooks cannot. On other LLMs you get skills + guardrails via AGENTS.md; you enforce gates manually.
→ System overview · Architecture docs

    git clone https://github.com/jvalin17/agent-toolkit.git
    cd agent-toolkit && ./install.sh          # once; needs python3, jq, Claude Code
    cd /path/to/your-project && claude        # hooks inject context; look for "AGENT TOOLKIT ACTIVE"
    /explore .                                # understand the codebase
    /precommit                                # before commit (default gate)

Natural language works: "fix the login bug" routes to /debug. Chain hands-off: /requirements auto my-app.
Auto-continuation — sessions use a two-layer limit: at 70 min a breadcrumb is saved to HANDOFF.md (session continues); at 200 min (or first compaction) a hard stop fires with a restart prompt. Two ways to run long tasks:

    agent-toolkit-continue "Build auth system"   # interactive: restarts in same terminal
    claude-auto "Build auth system"              # headless: for CI/background tasks

Or set "continue": true in gates.json for in-hook restart (headless only). Set "continue": false to disable the restart (the session will warn but keep running).
Install details & updates: docs/install-and-updates.md
| When | Do this |
|---|---|
| Building | /explore or /requirements → /implementation |
| Committing | /precommit → write findings → finalize_report.py → git commit |
| Pushing (guarded) | /evaluate → finalize → git push |

    python3 hooks/finalize_report.py precommit .scratch/precommit_<slug>/findings.json

With defaults, only the hook writes reports/ and .gates/, so the agent cannot fake gate files.
→ Full commit/push flows: docs/workflow.md · Gate profiles: shared/gate-unlock.md
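The finalize step above takes a skill name and a findings path. As a rough sketch (not the toolkit's own code), the command could be assembled from Python like this; `finalize_argv`, its `toolkit_root` parameter, and the `login-fix` slug are hypothetical names used only for illustration.

```python
from pathlib import Path


def finalize_argv(skill: str, slug: str, toolkit_root: str = ".") -> list[str]:
    """Build the argv for finalize_report.py (hypothetical helper, not part of the toolkit)."""
    script = Path(toolkit_root) / "hooks" / "finalize_report.py"
    findings = Path(".scratch") / f"{skill}_{slug}" / "findings.json"
    return ["python3", script.as_posix(), skill, findings.as_posix()]


# "login-fix" is an illustrative slug, not a value taken from the toolkit.
print(" ".join(finalize_argv("precommit", "login-fix")))
# python3 hooks/finalize_report.py precommit .scratch/precommit_login-fix/findings.json
```

Passing an absolute `toolkit_root` yields an absolute script path, which helps when the command runs from a different project directory.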
| Common skill | Purpose |
|---|---|
| /explore | Understand existing code |
| /requirements | Gather requirements |
| /implementation | Build with TDD |
| /precommit | Quality gate before commit |
| /debug | Hypothesis-driven debugging |
| /evaluate | Quality score (push gate) |
All 13 skills: docs/skills.md
All settings live in gates.json at your project root. Use presets or edit directly.

    agent-toolkit-setup --status      # show current config
    agent-toolkit-setup --balanced    # daily dev (default)
    agent-toolkit-setup --guarded     # production
    agent-toolkit-setup --lockdown    # strict + all reviews
    agent-toolkit-setup --tdd off     # toggle one setting

| Preset | Commit requires | Push requires | Use when |
|---|---|---|---|
| balanced (default) | /precommit | — | Daily development |
| guarded | /precommit | /evaluate | Production branches |
| lockdown | /precommit + /evaluate | /evaluate + /reviewer + /assess | High-risk changes |
| quick | — | — | Local experiments only |
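The preset table can be read as a lookup from preset and action to required skills. The sketch below transcribes it as data; `PRESETS` and `missing_gates` are hypothetical names, and the real hooks may store or evaluate this differently.

```python
# Preset requirements transcribed from the table above (skill names without the slash).
PRESETS = {
    "balanced": {"commit": ["precommit"], "push": []},
    "guarded": {"commit": ["precommit"], "push": ["evaluate"]},
    "lockdown": {
        "commit": ["precommit", "evaluate"],
        "push": ["evaluate", "reviewer", "assess"],
    },
    "quick": {"commit": [], "push": []},
}


def missing_gates(preset: str, action: str, passed: set[str]) -> list[str]:
    """Return the skills still required before `action` ("commit" or "push")."""
    return [skill for skill in PRESETS[preset][action] if skill not in passed]


print(missing_gates("lockdown", "push", {"evaluate"}))  # ['reviewer', 'assess']
print(missing_gates("balanced", "push", set()))         # []
```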
| Setting | Values | Default | What it does |
|---|---|---|---|
| enforcement | block / warn | block | Whether a missing gate blocks the commit/push or only warns |
| profile | minimal / standard / strict / paranoid | minimal | Which skills are required at commit and push |
| gate_mode | legacy / signed | legacy | How gates are verified; signed uses JWT for teams/CI |
| eval_threshold | 0–100 | 95 | Minimum /evaluate score to pass the push gate |
Examples:
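One way to read enforcement and eval_threshold together, as a minimal Python sketch. `push_gate` is a hypothetical helper, and treating a score equal to the threshold as passing is an assumption based on the word "minimum" in the table.

```python
def push_gate(score: int, config: dict) -> str:
    """Return "allow", "warn", or "block" for a push, given an /evaluate score.

    Hypothetical helper: the defaults come from the settings table, the
    decision logic is an assumption.
    """
    threshold = config.get("eval_threshold", 95)
    enforcement = config.get("enforcement", "block")
    if score >= threshold:
        return "allow"
    # Below the threshold, enforcement decides whether the failure is fatal.
    return "block" if enforcement == "block" else "warn"


print(push_gate(96, {}))                       # allow
print(push_gate(90, {}))                       # block
print(push_gate(90, {"enforcement": "warn"}))  # warn
print(push_gate(90, {"eval_threshold": 85}))   # allow
```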
| Setting | Values | Default | What it does |
|---|---|---|---|
| tdd | true / false | true | Enable test-first workflow enforcement |
| tdd_mode | remind / strict | remind | remind = advisory nudge; strict = hard-blocks source edits until tests exist |
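The two TDD settings combine into three outcomes. A hedged sketch follows, where `tdd_decision` and the `tests_exist` flag are hypothetical; how the real hook detects existing tests is not described in this README.

```python
def tdd_decision(config: dict, tests_exist: bool) -> str:
    """Return "allow", "remind", or "block" for an attempted source edit.

    Hypothetical helper: defaults follow the table (tdd=true, tdd_mode=remind).
    """
    if not config.get("tdd", True) or tests_exist:
        return "allow"
    # No tests yet: strict mode blocks, remind mode only nudges.
    return "block" if config.get("tdd_mode", "remind") == "strict" else "remind"


print(tdd_decision({}, tests_exist=False))                      # remind
print(tdd_decision({"tdd_mode": "strict"}, tests_exist=False))  # block
print(tdd_decision({"tdd_mode": "strict"}, tests_exist=True))   # allow
print(tdd_decision({"tdd": False}, tests_exist=False))          # allow
```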
Examples:

    // Nudge to write tests first but don't block (default)
    "tdd": true, "tdd_mode": "remind"
    // Hard-block: cannot edit src/ until a failing test exists
    "tdd": true, "tdd_mode": "strict"
    // Disable TDD enforcement entirely (not recommended)
    "tdd": false

| Setting | Values | Default | What it does |
|---|---|---|---|
| gate_protect | true / false | true | Block agent from writing .gates/ files directly |
| report_protect | true / false | true | Block agent from writing reports/ files directly |
| mode | normal / strict | normal | strict enables anti-fake drift detection on fixtures |
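A write guard along the lines of gate_protect and report_protect might look like the sketch below. `PROTECTED` and `write_allowed` are hypothetical names, and the check assumes the path is relative to the project root; it does not resolve `..` segments or symlinks, which a real hook would need to handle.

```python
from pathlib import PurePosixPath

# Setting name -> protected top-level directory, per the table above.
PROTECTED = {"gate_protect": ".gates", "report_protect": "reports"}


def write_allowed(rel_path: str, config: dict) -> bool:
    """Return False when an agent write targets a protected directory.

    Hypothetical helper: both protections default to true, as documented.
    """
    parts = PurePosixPath(rel_path).parts
    for setting, directory in PROTECTED.items():
        if config.get(setting, True) and parts and parts[0] == directory:
            return False
    return True


print(write_allowed("src/app.py", {}))                               # True
print(write_allowed(".gates/precommit.json", {}))                    # False
print(write_allowed("reports/eval.md", {"report_protect": False}))   # True
```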
Examples:

    // Default: agent cannot fake passing gates or reports
    "gate_protect": true, "report_protect": true
    // Strict anti-fake: detect drift in test fixtures and gate provenance
    "mode": "strict"
    // Disable protections (only for debugging the toolkit itself)
    "gate_protect": false, "report_protect": false

| Setting | Values | Default | What it does |
|---|---|---|---|
| compact_at_minutes | 0+ | 70 | Layer 1: write HANDOFF.md breadcrumb at this time; session continues |
| max_session_minutes | 0+ | 200 | Layer 2: hard stop; session ends with restart prompt |
| continue | true / false | true | Auto-restart session when context is exhausted (headless) |
| skill_routing | true / false | true | Auto-detect user intent and route to the matching skill |
| auto | true / false | false | Run skills in auto mode (no confirmation prompts) |
| model | auto / model name | auto | Override which model the agent uses |
Sessions use a two-layer limit system. Layer 1 (compact_at_minutes) writes HANDOFF.md as a breadcrumb so the agent can re-orient after compaction — the session keeps running. Layer 2 (max_session_minutes, or 1 compaction, or 700KB output) is a hard stop that writes HANDOFF.md with a restart prompt you can paste into a new session.
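The two layers can be sketched as a single decision function. This is an illustration under stated assumptions, not the hook's code: `session_action` and its parameters are hypothetical, limits are treated as inclusive, and a `max_session_minutes` of 0 is assumed to disable the time-based stop (the README only states that 0 disables Layer 1).

```python
def session_action(minutes: float, compactions: int, output_kb: float, config: dict) -> str:
    """Return "hard_stop", "breadcrumb", or "continue" for the current session state."""
    soft = config.get("compact_at_minutes", 70)
    hard = config.get("max_session_minutes", 200)
    # Layer 2: any one of the three conditions ends the session.
    # Assumption: hard == 0 turns off the time-based condition only.
    if (hard and minutes >= hard) or compactions >= 1 or output_kb >= 700:
        return "hard_stop"
    # Layer 1: 0 disables the breadcrumb, as in the examples in this README.
    if soft and minutes >= soft:
        return "breadcrumb"
    return "continue"


print(session_action(10, 0, 0, {}))                           # continue
print(session_action(75, 0, 0, {}))                           # breadcrumb
print(session_action(75, 1, 0, {}))                           # hard_stop
print(session_action(75, 0, 0, {"compact_at_minutes": 0}))    # continue
```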
Examples:

    // Two-layer defaults: breadcrumb at 70 min, hard stop at 200 min
    "compact_at_minutes": 70, "max_session_minutes": 200
    // Shorter sessions (e.g. for cost control)
    "compact_at_minutes": 30, "max_session_minutes": 60
    // Disable Layer 1 breadcrumb (only hard stop at 200 min)
    "compact_at_minutes": 0
    // Auto-restart when context runs out (headless mode)
    "continue": true
    // Keep session alive with warnings only (no restart)
    "continue": false
    // Disable skill routing (manual /skill invocation only)
    "skill_routing": false
    // Run skills without asking for confirmation
    "auto": true
    // Pin to a specific model
    "model": "opus"

| Setting | Values | Default | What it does |
|---|---|---|---|
| test_command | shell command | "python3 -m pytest tests/ -q" | Command the toolkit runs to execute tests |
| lint_command | shell command | "python3 -m compileall -q ..." | Command the toolkit runs to lint/check code |
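Since both values are described as shell commands, running one from Python could look roughly like this. `run_configured` is a hypothetical helper; whether the toolkit itself goes through a shell, and how it treats output, is an assumption here. `shell=True` should only ever see commands from a trusted gates.json.

```python
import shlex
import subprocess
import sys


def run_configured(command: str) -> bool:
    """Run a configured shell command and report whether it exited with status 0."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode == 0


# Stand-in commands so the sketch runs anywhere; a real project would pass
# config["test_command"], e.g. "python3 -m pytest tests/ -q".
passing = shlex.join([sys.executable, "-c", "raise SystemExit(0)"])
failing = shlex.join([sys.executable, "-c", "raise SystemExit(3)"])
print(run_configured(passing))  # True
print(run_configured(failing))  # False
```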
Examples:

    // Node.js project
    "test_command": "npm test",
    "lint_command": "npm run lint"
    // Go project
    "test_command": "go test ./...",
    "lint_command": "golangci-lint run"
    // Rust project
    "test_command": "cargo test",
    "lint_command": "cargo clippy"
    // Python with coverage
    "test_command": "python3 -m pytest tests/ --cov=src -q",
    "lint_command": "ruff check ."

Full gates.json example:

    {
      "enforcement": "block",
      "profile": "standard",
      "gate_mode": "legacy",
      "eval_threshold": 95,
      "tdd": true,
      "tdd_mode": "remind",
      "gate_protect": true,
      "report_protect": true,
      "mode": "normal",
      "continue": true,
      "compact_at_minutes": 70,
      "max_session_minutes": 200,
      "skill_routing": true,
      "auto": false,
      "model": "auto",
      "test_command": "npm test",
      "lint_command": "npm run lint"
    }

→ Full reference: docs/configuration.md · Signed gates: shared/gate-unlock.md
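To see which values are in effect for a partial gates.json, the documented defaults can be overlaid with the file contents. The sketch below assumes a plain key-by-key merge, which may not match the toolkit's actual loader; `DEFAULTS` and `effective_config` are hypothetical names, and `lint_command` is left out because its default is truncated in the table.

```python
import json

# Defaults transcribed from the settings tables in this README.
DEFAULTS = {
    "enforcement": "block", "profile": "minimal",
    "gate_mode": "legacy", "eval_threshold": 95,
    "tdd": True, "tdd_mode": "remind",
    "gate_protect": True, "report_protect": True, "mode": "normal",
    "compact_at_minutes": 70, "max_session_minutes": 200, "continue": True,
    "skill_routing": True, "auto": False, "model": "auto",
    "test_command": "python3 -m pytest tests/ -q",
}


def effective_config(raw_json: str) -> dict:
    """Overlay the text of a gates.json file on the documented defaults."""
    return {**DEFAULTS, **json.loads(raw_json)}


cfg = effective_config('{"profile": "standard", "test_command": "npm test"}')
print(cfg["profile"], cfg["eval_threshold"], cfg["test_command"])
# standard 95 npm test
```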
| Doc | For |
|---|---|
| System overview | How skills, hooks, gates, and reports connect |
| Daily workflow | Commit, push, finalize, gate profiles |
| Install & updates | First setup, auto-sync, manual refresh |
| Other LLMs | Cursor, GPT, Gemini, Windsurf, Aider |
| Skills reference | All 13 skills |
| Configuration | gates.json, presets, signed mode |
| Gate unlock | Legacy vs signed, rare options |
| Troubleshooting | Common failures |
| Guardrails | All G-* rules |
| Architecture index | Design docs, requirements |
| Feature | Doc |
|---|---|
| Auto-continuation (long tasks) | agent-toolkit-continue (interactive) / claude-auto (headless) · architecture/auto-continuation.md |
| TDD strict mode | "tdd_mode": "strict" in gates.json — blocks source edits until tests exist |
| Strict mode (anti-fake) | shared/strict-mode.md |
| Signed gates (teams / CI) | shared/gate-unlock.md |
| Auto mode (/skill auto) | shared/orchestrator.md |
The skill tried to run finalize_report.py using a relative path from a different project directory. Make sure each skill's SKILL.md uses the absolute path:

    python3 /path/to/agent-toolkit/hooks/finalize_report.py <skill> .scratch/<skill>_<slug>/findings.json

The gate hook blocks commits when no gates.json is found, assuming the project is toolkit-managed. Two fixes:
- Upgrade the toolkit: the latest gate hook skips enforcement for repos without gates.json
- Temporary bypass: set the env var before your commit:

      AGENT_TOOLKIT_ENFORCEMENT=warn git commit -m "your message"
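The bypass suggests the environment variable takes precedence over gates.json for a single command. A sketch of that precedence, assuming the variable accepts the same block / warn values as the enforcement setting (only warn is shown in this README); `effective_enforcement` is a hypothetical helper.

```python
import os


def effective_enforcement(config: dict, env=None) -> str:
    """Resolve enforcement, letting AGENT_TOOLKIT_ENFORCEMENT override gates.json."""
    env = os.environ if env is None else env
    override = env.get("AGENT_TOOLKIT_ENFORCEMENT")
    # Assumption: unrecognized override values are ignored rather than rejected.
    if override in ("block", "warn"):
        return override
    return config.get("enforcement", "block")


print(effective_enforcement({"enforcement": "block"}, {"AGENT_TOOLKIT_ENFORCEMENT": "warn"}))  # warn
print(effective_enforcement({}, {}))                                                           # block
```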
The legacy fallback triggers when gates.json exists but has no commit_requires. Either:
- Add "commit_requires": ["precommit"] to gates.json and run /precommit
- Or remove gates.json to opt out of gating entirely
PRs welcome. Open an issue with battle-tested patterns or bugs you caught.
Licensed under the Apache License, Version 2.0. See LICENSE (SPDX: Apache-2.0).