Agent Toolkit

Skills, guardrails, and structural hooks for AI coding agents. Plan, build, test, debug, and ship — any repo, any language.

Best on Claude Code — hooks enforce rules the model cannot bypass.
Also works on Cursor, Codex, Gemini, Windsurf, and Aider via project rules (see the setup guide).

What this is

Piece	Purpose
Skills	Step-by-step workflows — `/explore`, `/implementation`, `/precommit`, …
Guardrails	Safety and quality rules (`shared/guardrails.md`)
Hooks	Structural enforcement on Claude Code — block bad writes, gate commits, route skills

Prompt rules can be ignored. Hooks cannot. With other LLM tools you get skills and guardrails via AGENTS.md, but you must enforce gates manually.

→ System overview · Architecture docs

Quick start

git clone https://github.com/jvalin17/agent-toolkit.git
cd agent-toolkit && ./install.sh          # once — needs python3, jq, Claude Code

cd /path/to/your-project && claude        # hooks inject context; look for "AGENT TOOLKIT ACTIVE"
/explore .                                # understand the codebase
/precommit                                # before commit (default gate)

Natural language works: "fix the login bug" routes to /debug. To chain skills without confirmation prompts, use auto mode: /requirements auto my-app.

Auto-continuation: sessions use a two-layer limit. At 70 min a breadcrumb is saved to HANDOFF.md and the session continues. At 200 min (or on the first compaction) a hard stop fires with a restart prompt. There are two ways to run long tasks:

agent-toolkit-continue "Build auth system"   # interactive — restarts in same terminal
claude-auto "Build auth system"              # headless — for CI/background tasks

Or set "continue": true in gates.json for in-hook restart (headless only). Set "continue": false to disable (session will warn but keep running).

Install details & updates: docs/install-and-updates.md

Daily workflow

When	Do this
Building	`/explore` or `/requirements` → `/implementation`
Committing	`/precommit` → write findings → `finalize_report.py` → `git commit`
Pushing (guarded)	`/evaluate` → finalize → `git push`

python3 hooks/finalize_report.py precommit .scratch/precommit_<slug>/findings.json

With defaults, only the hook writes reports/ and .gates/ — the agent cannot fake gate files.

→ Full commit/push flows: docs/workflow.md · Gate profiles: shared/gate-unlock.md

Skills

Common
`/explore`	Understand existing code
`/requirements`	Gather requirements
`/implementation`	Build with TDD
`/precommit`	Quality gate before commit
`/debug`	Hypothesis-driven debugging
`/evaluate`	Quality score (push gate)

All 13 skills: docs/skills.md

Configuration

All settings live in gates.json at your project root. Use presets or edit directly.

agent-toolkit-setup --status      # show current config
agent-toolkit-setup --balanced    # daily dev (default)
agent-toolkit-setup --guarded     # production
agent-toolkit-setup --lockdown    # strict + all reviews
agent-toolkit-setup --tdd off     # toggle one setting

Presets

Preset	Commit requires	Push requires	Use when
balanced (default)	`/precommit`	—	Daily development
guarded	`/precommit`	`/evaluate`	Production branches
lockdown	`/precommit` + `/evaluate`	`/evaluate` + `/reviewer` + `/assess`	High-risk changes
quick	—	—	Local experiments only
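
The preset table can be read as a lookup from preset name to the skills that must have passed before a commit or a push. The sketch below encodes that table in Python to make the relationship concrete. `PRESETS` and `missing_gates` are hypothetical names chosen for illustration; they are not part of the toolkit, and the real hooks may store this differently.

```python
# Illustrative only: mirrors the preset table above.
# PRESETS and missing_gates are hypothetical names, not toolkit APIs.
PRESETS = {
    "balanced": {"commit": ["precommit"], "push": []},
    "guarded": {"commit": ["precommit"], "push": ["evaluate"]},
    "lockdown": {
        "commit": ["precommit", "evaluate"],
        "push": ["evaluate", "reviewer", "assess"],
    },
    "quick": {"commit": [], "push": []},
}


def missing_gates(preset, action, passed):
    """Return the skills still required before `action` ('commit' or 'push')."""
    required = PRESETS[preset][action]
    return [skill for skill in required if skill not in passed]
```

For example, `missing_gates("guarded", "push", {"precommit"})` reports that `evaluate` is still outstanding, which matches the guarded row of the table.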

All settings

Gate enforcement

Setting	Values	Default	What it does
`enforcement`	`block` / `warn`	`block`	Whether missing gates prevent or just warn on commit/push
`profile`	`minimal` / `standard` / `strict` / `paranoid`	`minimal`	Which skills are required at commit and push
`gate_mode`	`legacy` / `signed`	`legacy`	How gates are verified — `signed` uses JWT for teams/CI
`eval_threshold`	`0`–`100`	`95`	Minimum `/evaluate` score to pass the push gate

Examples:

// Block commits that skip /precommit (default behavior)
"enforcement": "block"

// Just warn (useful when rolling out gates on an existing project)
"enforcement": "warn"

// Require /evaluate before push (production branches)
"profile": "standard"

// Require /evaluate + /reviewer + /assess before push (high-risk)
"profile": "paranoid"

// Use JWT-signed gates (team repos with branch protection)
"gate_mode": "signed"

// Lower the bar for evaluate score (e.g. early prototypes)
"eval_threshold": 80

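One way to picture how `eval_threshold` and `enforcement` interact at the push gate is the small decision function below. It is a sketch under an assumption: a score below the threshold is treated like a missing gate, so `enforcement` decides whether the push is blocked or only warned about. The function name `push_gate_decision` is hypothetical and not a toolkit API.

```python
def push_gate_decision(score, eval_threshold=95, enforcement="block"):
    """Return 'pass', 'warn', or 'block' for a given /evaluate score.

    Assumption (not confirmed by the toolkit docs): a score below the
    threshold is handled the same way as a missing gate.
    """
    if not 0 <= eval_threshold <= 100:
        raise ValueError("eval_threshold must be between 0 and 100")
    if enforcement not in ("block", "warn"):
        raise ValueError("enforcement must be 'block' or 'warn'")
    if score >= eval_threshold:
        return "pass"
    # Gate not satisfied: the enforcement setting picks the outcome.
    return enforcement
```

With the defaults, a score of 90 is blocked; lowering `eval_threshold` to 80 or switching `enforcement` to `warn` lets the same score through, the latter with a warning.
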
TDD & quality

Setting	Values	Default	What it does
`tdd`	`true` / `false`	`true`	Enable test-first workflow enforcement
`tdd_mode`	`remind` / `strict`	`remind`	`remind` = advisory nudge; `strict` = hard-blocks source edits until tests exist

Examples:

// Nudge to write tests first but don't block (default)
"tdd": true, "tdd_mode": "remind"

// Hard-block: cannot edit src/ until a failing test exists
"tdd": true, "tdd_mode": "strict"

// Disable TDD enforcement entirely (not recommended)
"tdd": false

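The difference between `remind` and `strict` can be summarized as a three-way outcome for a source edit. The sketch below is an illustration of the table's wording, not the hook's actual code; `tdd_action` and the `tests_exist` flag are hypothetical, and how the toolkit detects existing tests is not specified here.

```python
def tdd_action(tdd, tdd_mode, tests_exist):
    """Return 'allow', 'remind', or 'block' for an attempted source edit.

    tests_exist is a hypothetical flag supplied by the caller.
    """
    if not tdd or tests_exist:
        return "allow"
    if tdd_mode == "strict":
        return "block"   # hard-block until tests exist
    return "remind"      # advisory nudge only
```
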
Security & anti-fake

Setting	Values	Default	What it does
`gate_protect`	`true` / `false`	`true`	Block agent from writing `.gates/` files directly
`report_protect`	`true` / `false`	`true`	Block agent from writing `reports/` files directly
`mode`	`normal` / `strict`	`normal`	`strict` enables anti-fake drift detection on fixtures

Examples:

// Default: agent cannot fake passing gates or reports
"gate_protect": true, "report_protect": true

// Strict anti-fake: detect drift in test fixtures and gate provenance
"mode": "strict"

// Disable protections (only for debugging the toolkit itself)
"gate_protect": false, "report_protect": false

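As a rough picture of what `gate_protect` and `report_protect` guard against, the sketch below checks whether a project-relative path falls under `.gates/` or `reports/`. It is a simplified illustration: `is_protected_write` is a hypothetical name, the file names in the usage note are made up, and the real hook may also handle absolute paths, symlinks, or `..` segments, which this sketch does not.

```python
from pathlib import PurePosixPath


def is_protected_write(path, gate_protect=True, report_protect=True):
    """True if a direct agent write to `path` should be blocked.

    `path` is assumed to be relative to the project root.
    """
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    top = parts[0]  # first directory component, e.g. '.gates'
    if gate_protect and top == ".gates":
        return True
    if report_protect and top == "reports":
        return True
    return False
```

Under these assumptions, a write to `.gates/precommit.json` is refused with the defaults, while `src/app.py` is allowed. Setting `report_protect` to `False` stops guarding `reports/` only.
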
Session behavior

Setting	Values	Default	What it does
`compact_at_minutes`	`0`+	`70`	Layer 1: write HANDOFF.md breadcrumb at this time; session continues
`max_session_minutes`	`0`+	`200`	Layer 2: hard stop — session ends with restart prompt
`continue`	`true` / `false`	`true`	Auto-restart session when context is exhausted (headless)
`skill_routing`	`true` / `false`	`true`	Auto-detect user intent and route to the matching skill
`auto`	`true` / `false`	`false`	Run skills in auto mode (no confirmation prompts)
`model`	`auto` / model name	`auto`	Override which model the agent uses

Sessions use a two-layer limit system. Layer 1 (compact_at_minutes) writes HANDOFF.md as a breadcrumb so the agent can re-orient after compaction; the session keeps running. Layer 2 is a hard stop that fires at max_session_minutes, on the first compaction, or at 700 KB of output, whichever comes first. It writes HANDOFF.md with a restart prompt you can paste into a new session.

Examples:

// Two-layer defaults: breadcrumb at 70 min, hard stop at 200 min
"compact_at_minutes": 70, "max_session_minutes": 200

// Shorter sessions (e.g. for cost control)
"compact_at_minutes": 30, "max_session_minutes": 60

// Disable Layer 1 breadcrumb (only hard stop at 200 min)
"compact_at_minutes": 0

// Auto-restart when context runs out (headless mode)
"continue": true

// Keep session alive with warnings only (no restart)
"continue": false

// Disable skill routing (manual /skill invocation only)
"skill_routing": false

// Run skills without asking for confirmation
"auto": true

// Pin to a specific model
"model": "opus"

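The two-layer limit described above can be sketched as a single decision function. This is an illustration, not the hook's implementation. `session_action` is a hypothetical name, 700 KB is taken as 700 × 1024 bytes, and treating `max_session_minutes: 0` as "no time limit" is an assumption (the docs only show `0` disabling the Layer 1 breadcrumb).

```python
def session_action(elapsed_min, compactions=0, output_bytes=0,
                   compact_at_minutes=70, max_session_minutes=200):
    """Return 'continue', 'breadcrumb', or 'hard_stop'."""
    # Layer 2: any one of the three conditions ends the session.
    hard_stop = (
        (max_session_minutes > 0 and elapsed_min >= max_session_minutes)
        or compactions >= 1
        or output_bytes >= 700 * 1024
    )
    if hard_stop:
        return "hard_stop"
    # Layer 1: write the HANDOFF.md breadcrumb; the session keeps running.
    if compact_at_minutes > 0 and elapsed_min >= compact_at_minutes:
        return "breadcrumb"
    return "continue"
```

Layer 2 is checked first so that a session which has already compacted stops even if it is younger than `compact_at_minutes`.
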
Project commands

Setting	Values	Default	What it does
`test_command`	shell command	`"python3 -m pytest tests/ -q"`	Command the toolkit runs to execute tests
`lint_command`	shell command	`"python3 -m compileall -q ..."`	Command the toolkit runs to lint/check code

Examples:

// Node.js project
"test_command": "npm test",
"lint_command": "npm run lint"

// Go project
"test_command": "go test ./...",
"lint_command": "golangci-lint run"

// Rust project
"test_command": "cargo test",
"lint_command": "cargo clippy"

// Python with coverage
"test_command": "python3 -m pytest tests/ --cov=src -q",
"lint_command": "ruff check ."

Example: full config

{
  "enforcement": "block",
  "profile": "standard",
  "gate_mode": "legacy",
  "eval_threshold": 95,
  "tdd": true,
  "tdd_mode": "remind",
  "gate_protect": true,
  "report_protect": true,
  "mode": "normal",
  "continue": true,
  "compact_at_minutes": 70,
  "max_session_minutes": 200,
  "skill_routing": true,
  "auto": false,
  "model": "auto",
  "test_command": "npm test",
  "lint_command": "npm run lint"
}
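
If you want to sanity-check a gates.json by hand, a small loader that fills in the documented defaults and rejects unknown enum values can help. The sketch below is not the toolkit's own loader; `DEFAULTS`, `ALLOWED`, and `load_gates` are hypothetical names, and the values are copied from the settings tables above. `lint_command` is left out of the defaults because its default is shown truncated in the table.

```python
import json

# Defaults as listed in the settings tables (lint_command omitted:
# its documented default is truncated).
DEFAULTS = {
    "enforcement": "block",
    "profile": "minimal",
    "gate_mode": "legacy",
    "eval_threshold": 95,
    "tdd": True,
    "tdd_mode": "remind",
    "gate_protect": True,
    "report_protect": True,
    "mode": "normal",
    "compact_at_minutes": 70,
    "max_session_minutes": 200,
    "continue": True,
    "skill_routing": True,
    "auto": False,
    "model": "auto",
    "test_command": "python3 -m pytest tests/ -q",
}

# Enumerated settings and their documented values.
ALLOWED = {
    "enforcement": {"block", "warn"},
    "profile": {"minimal", "standard", "strict", "paranoid"},
    "gate_mode": {"legacy", "signed"},
    "tdd_mode": {"remind", "strict"},
    "mode": {"normal", "strict"},
}


def load_gates(text):
    """Parse gates.json text, apply defaults, and validate enum settings."""
    config = {**DEFAULTS, **json.loads(text)}
    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    if not 0 <= config["eval_threshold"] <= 100:
        raise ValueError("eval_threshold must be between 0 and 100")
    return config
```

Calling `load_gates('{"profile": "standard"}')` returns a full config with `profile` overridden and every other setting at its documented default; a typo such as `"enforcement": "maybe"` raises `ValueError`.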

→ Full reference: docs/configuration.md · Signed gates: shared/gate-unlock.md

Documentation

Doc	For
System overview	How skills, hooks, gates, and reports connect
Daily workflow	Commit, push, finalize, gate profiles
Install & updates	First setup, auto-sync, manual refresh
Other LLMs	Cursor, GPT, Gemini, Windsurf, Aider
Skills reference	All 13 skills
Configuration	`gates.json`, presets, signed mode
Gate unlock	Legacy vs signed, rare options
Troubleshooting	Common failures
Guardrails	All G-* rules
Architecture index	Design docs, requirements

Advanced

Feature	Doc
Auto-continuation (long tasks)	`agent-toolkit-continue` (interactive) / `claude-auto` (headless) · architecture/auto-continuation.md
TDD strict mode	`"tdd_mode": "strict"` in `gates.json` — blocks source edits until tests exist
Strict mode (anti-fake)	shared/strict-mode.md
Signed gates (teams / CI)	shared/gate-unlock.md
Auto mode (`/skill auto`)	shared/orchestrator.md

Troubleshooting

`finalize_report.py: No such file or directory`

The skill tried to run finalize_report.py using a relative path from a different project directory. Make sure each skill's SKILL.md file uses the absolute path:

python3 /path/to/agent-toolkit/hooks/finalize_report.py <skill> .scratch/<skill>_<slug>/findings.json

`BLOCKED: git commit requires precommit skill`

Older versions of the gate hook block commits when no gates.json is found, on the assumption that the project is toolkit-managed. Two fixes:

Upgrade the toolkit. The latest gate hook skips enforcement for repos without gates.json.

Temporary bypass. Set the env var before your commit:

AGENT_TOOLKIT_ENFORCEMENT=warn git commit -m "your message"
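
The bypass above implies that the environment variable takes precedence over the `enforcement` value in gates.json. A minimal sketch of that precedence follows, assuming the variable simply overrides the file setting and that `block` is the fallback. `effective_enforcement` is a hypothetical name, and the real hook's resolution order may differ.

```python
import os


def effective_enforcement(config, env=None):
    """Resolve enforcement: env var first, then gates.json, then 'block'.

    Assumed precedence, inferred from the bypass command above.
    """
    env = os.environ if env is None else env
    override = env.get("AGENT_TOOLKIT_ENFORCEMENT")
    if override:
        return override
    return config.get("enforcement", "block")
```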

`Run install.sh in project root`

The legacy fallback triggers when gates.json exists but has no commit_requires. Either:

Add "commit_requires": ["precommit"] to gates.json and run /precommit
Or remove gates.json to opt out of gating entirely

Contributing

PRs welcome. Open an issue with battle-tested patterns or bugs you caught.

License

Licensed under the Apache License, Version 2.0. See LICENSE (SPDX: Apache-2.0).
