Some jobs are not a straight line (parse → layout → code). The model may need to decide what to do next: “read this file,” “run grep,” “then edit.” That is multi-step: each step is small, but the sequence is chosen dynamically within limits you set.
Neighbors: Modular prompt architecture · Prompts overview · Chapter 03 — Workflow · Chapter 04 — Agent design
| Pattern | When to use | Risk |
|---|---|---|
| Fixed pipeline | Figma → IR → layout → map → codegen (predictable SLAs) | Less flexible for odd files |
| Planner + executor | Unknown repo layout, exploratory refactors, “fix until green” | Higher cost, needs hard caps |
The planner LLM (or a smaller model) emits PlanStep[] JSON—never free prose:
{
"schemaVersion": 1,
"steps": [
{ "id": "s1", "kind": "tool", "tool": "read_file", "args": { "path": "src/App.tsx" } },
{ "id": "s2", "kind": "llm_subtask", "agent": "codegen", "inputRef": "s1" }
]
}Executor loop (pseudocode shape):
plan = planner.run(context)
for step in plan.steps within max_steps:
if step.kind == tool: run allowlisted tool; append observation
if step.kind == llm_subtask: call specialist with narrowed context
if verifier.fail: optionally replan with error digest (bounded)
max_steps,max_tool_calls,max_wall_ms,max_usdper job.- Tool allowlist only (
read_file,grep,apply_patch…)—no arbitrary shell from planner output. - Re-plan budget: at most
Knew plans after failure (e.g.K=2), then escalate to human. - State: persist
{ messages[], toolResults[], planId }so workers are idempotent on retry.
Keep the Figma core as a fixed pipeline for quality. Use planner+tools only inside bounded islands—for example “repair until tests pass” after codegen, or “discover DS import paths” on unfamiliar repos.
flowchart TB
ctx[Job_context_IR_errors]
pl[Planner_LLM_PlanStep_JSON]
ex[Executor_loop]
tools[Allowlisted_tools]
sub[Specialist_LLM_steps]
ver[Verifier_tests_lint]
replan{fail_and_replan_budget_left}
ctx --> pl
pl --> ex
ex --> tools
ex --> sub
tools --> ver
sub --> ver
ver -->|pass| done([Done])
ver -->|fail| replan
replan -->|yes| pl
replan -->|no| esc([Human_escalation])
After sandbox fails with Cannot find module './Hero.module.css', the planner emits: read_file Hero.tsx → codegen subtask “add missing CSS module with classes used in TSX.” Executor runs tools, then calls codegen with only those observations—full repo never enters context.
- Runaway tool loops (“read entire tree”): cap depth and file count per tool batch.
- Planner hallucinating tools: schema-validate
PlanStepand reject unknowntoolnames. - Mixing planner freedom with Figma determinism: keep Figma IR extraction out of the planner; planner operates on workspace + errors.
- Log per-step latency and tokens; attribute cost to
planId. - Prefer deterministic preprocessing before offering the planner a job (smaller plan, fewer surprises).
A planner is not “smarter”; it is another model with more degrees of freedom. You only win if tools + verifiers are strict—otherwise you traded a straight pipeline for an expensive random walk.