permoon/multi-agent-dev-loop

multi-agent-dev-loop

A standalone skill for running high-risk implementation work through a multi-agent development loop.

Workflow diagram

Instead of asking one agent to plan, code, deploy, and debug in a single fragile thread, this skill coordinates Claude + Codex + Gemini through a fixed workflow:

plan -> review -> code -> review -> deploy -> smoke test -> triage

The goal is not ceremony. The goal is to make autonomous development resumable, inspectable, and safer when the blast radius is real.

繁體中文

Why this exists

AI coding agents are good at moving fast, but risky work needs more than speed. When a task touches schema, IAM, data pipelines, deploy config, migrations, or multiple files, the hard part is usually not writing code. It is keeping the plan, assumptions, validation, rollback path, and post-deploy evidence aligned.

multi-agent-dev-loop gives the agent a repeatable operating model:

  • Write a concrete plan before coding
  • Challenge the plan and validation strategy with another agent
  • Produce a smoke test as part of implementation
  • Review deploy-sensitive changes before release
  • Route failures back to the correct step instead of guessing
  • Leave artifacts behind so humans can audit or resume the work

When to use it

Use this skill for non-trivial implementation work:

  • Multi-file features or refactors
  • Architecture decisions
  • Schema, IAM, or data changes
  • Deploys and migrations
  • Changes that carry rollback risk
  • Changes with a large blast radius
  • Work that should be auditable or resumable

Skip it for:

  • Typos and single-line edits
  • Formatting-only changes
  • Pure exploration or Q&A
  • One-off scripts that will not be deployed
  • Explicit quick fixes

What it does

| Step | Owner | Output |
| --- | --- | --- |
| 1. Plan | Claude | plans/<feature>/plan.md + validation.md |
| 2. Plan review | Codex | plans/<feature>/review-codex.md |
| 3. Revise + re-review | Claude + Codex | revised plan artifacts |
| 3.5. Red team, conditional | Claude + Codex + Gemini | plans/<feature>/red-team.md |
| 4. Implement | Codex | source files + scripts/smoke/<feature>.sh |
| 5. Code review | Claude | inline fixes or review notes back to Codex |
| 6. Deploy review, conditional GCP | Gemini | plans/<feature>/review-gemini.md |
| 7. Verify + triage | Claude | runs/<timestamp>-<feature>/{smoke.log,triage.md} |

If the smoke test fails, the skill classifies the failure and routes it:

| Failure type | Route |
| --- | --- |
| Deploy failed: service will not start, IAM denied, invalid config | Step 6 |
| Deploy succeeded but behavior is wrong | Step 4 |
| Behavior matches code but not the intended plan | Step 1 |
| Smoke test is a false negative | Fix validation.md, then rerun from Step 4 |
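
The routing table above can be read as a simple lookup. The sketch below is one possible encoding in POSIX sh. The function name `route_failure` and the class labels (`deploy-failed`, `wrong-behavior`, `plan-mismatch`, `false-negative`) are hypothetical names chosen for illustration, not identifiers defined by the skill.

```sh
#!/bin/sh
# Hypothetical helper: map a failure class to the step the loop resumes from.
# The class labels are illustrative; the skill itself does not define them.
route_failure() {
  case "$1" in
    deploy-failed)  echo "Step 6" ;;   # service will not start, IAM denied, invalid config
    wrong-behavior) echo "Step 4" ;;   # deploy succeeded but behavior is wrong
    plan-mismatch)  echo "Step 1" ;;   # behavior matches code but not the plan
    false-negative) echo "Fix validation.md, then rerun from Step 4" ;;
    *) echo "unknown failure class: $1" >&2; return 1 ;;
  esac
}

route_failure deploy-failed
```

Classifying the failure still takes judgment. The lookup only makes the route explicit once a class has been chosen.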

Install

This repository is now a standalone skill. Copy the repository folder into the skills directory used by your agent environment.

For Codex:

mkdir -p ~/.codex/skills
cp -R /path/to/multi-agent-dev-loop ~/.codex/skills/

For Claude Code or other skill-compatible environments, copy this folder into that tool's configured skills directory.

The skill entrypoint is:

SKILL.md

Prerequisites

  • codex CLI installed and authenticated (codex exec --help works)
  • gemini CLI installed and authenticated for red-team and GCP deploy review
  • A working directory where the artifact tree can be created

Gemini is only needed for conditional red-team and GCP deploy-review steps.
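
A preflight check along these lines may help catch a missing CLI before the loop starts. This is a minimal sketch: the helper name `check_cli` is hypothetical, and it only confirms that a command is on PATH. It does not verify authentication.

```sh
#!/bin/sh
# Hypothetical helper: report whether a CLI is on PATH.
# Prints "ok <name>" on success, "missing <name>" (and returns 1) otherwise.
check_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok $1"
  else
    echo "missing $1"
    return 1
  fi
}

# codex is always required; gemini only for the conditional steps.
check_cli codex  || echo "install and authenticate codex before starting" >&2
check_cli gemini || echo "gemini missing: red-team and GCP deploy review unavailable" >&2
```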

Artifact tree

plans/<feature>/
  plan.md
  validation.md
  red-team.md
  review-codex.md
  review-gemini.md
deploy/<feature>/
scripts/smoke/<feature>.sh
runs/<timestamp>-<feature>/
  smoke.log
  triage.md

<feature> should be kebab-case, such as workflow-daily-ingest. <timestamp> uses YYYYMMDD-HHMMSS.
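
The naming rules above can be sketched as a small scaffold script. The function names `is_kebab_case` and `scaffold_artifacts` are hypothetical, as is the optional second argument for the root directory. The sketch also assumes kebab-case means lowercase letters and digits separated by single hyphens.

```sh
#!/bin/sh
# Hypothetical helper: succeed only for kebab-case names such as
# workflow-daily-ingest (lowercase letters and digits, single hyphens).
is_kebab_case() {
  case "$1" in
    ''|-*|*-|*--*|*[!a-z0-9-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: create the artifact directories for one feature
# under an optional root (default: current directory) and print the
# timestamped run directory, formatted as YYYYMMDD-HHMMSS-<feature>.
scaffold_artifacts() {
  feature=$1
  root=${2:-.}
  is_kebab_case "$feature" || { echo "not kebab-case: $feature" >&2; return 1; }
  run_dir="$root/runs/$(date +%Y%m%d-%H%M%S)-$feature"
  mkdir -p "$root/plans/$feature" "$root/deploy/$feature" \
           "$root/scripts/smoke" "$run_dir"
  echo "$run_dir"
}
```

Calling `scaffold_artifacts workflow-daily-ingest` from the working directory creates the empty directories. The files inside them are still produced by the individual steps.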

Output contract

After each step, the skill reports exactly three lines:

Step: <step just finished>
Artifact: <file path produced>
Next: <next step or escalation reason>

Large outputs go into files, not chat.
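
As an illustration of the contract, the three lines could be emitted by a helper like the one below. The function name `report_step` is hypothetical. The step labels and artifact path are taken from the step table and the example feature name used elsewhere in this README.

```sh
#!/bin/sh
# Hypothetical helper: print the three-line status report for a finished step.
# Arguments: <step> <artifact path> <next step or escalation reason>
report_step() {
  printf 'Step: %s\nArtifact: %s\nNext: %s\n' "$1" "$2" "$3"
}

report_step "2. Plan review" \
  "plans/workflow-daily-ingest/review-codex.md" \
  "3. Revise + re-review"
```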

Example

User request:

Add a daily aggregate table analytics.daily_user_summary and a workflow to
refresh it at 6am.

The skill produces:

  • A concrete implementation plan and validation plan
  • Codex review notes challenging schema, IAM, deploy order, and smoke coverage
  • Implementation plus an idempotent smoke test
  • Claude code review
  • Optional Gemini deploy review if GCP risk is high
  • Smoke-test output and triage if verification fails
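
The last item in the list above, capturing smoke-test output for triage, might look roughly like this. It is a sketch under assumptions: the function name `run_smoke` is hypothetical, and the smoke script is assumed to be runnable with `sh`.

```sh
#!/bin/sh
# Hypothetical helper: run a smoke script and capture stdout and stderr in
# <run_dir>/smoke.log. Returns the smoke script's exit status so the caller
# can decide whether triage is needed.
# Assumption: the smoke script is compatible with plain sh.
run_smoke() {
  smoke_script=$1
  run_dir=$2
  mkdir -p "$run_dir"
  sh "$smoke_script" >"$run_dir/smoke.log" 2>&1
}
```

A caller could wrap it as `if run_smoke scripts/smoke/workflow-daily-ingest.sh "$run_dir"; then ...; fi` and write triage.md only on the failure branch.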

License

MIT
