Commit 3e03365

feat(ce-work-beta): adaptive effort selection for Codex delegation batches (#759)
1 parent 5139ff1 commit 3e03365

2 files changed

Lines changed: 42 additions & 2 deletions

plugins/compound-engineering/skills/ce-work-beta/SKILL.md

Lines changed: 2 additions & 1 deletion
@@ -65,7 +65,8 @@ Store the resolved state for downstream consumption:
 - `sandbox_mode` -- `yolo` or `full-auto` (from config or default `yolo`)
 - `consent_granted` -- boolean (from config `work_delegate_consent`)
 - `delegate_model` -- string from config, or unset (defer to Codex config)
-- `delegate_effort` -- string from config, or unset (defer to Codex config)
+- `delegate_effort` -- string from config, or unset (defer to Codex config). Floor for per-batch effort selection; not passed directly to `codex exec`.
+- `effective_effort` -- per-batch derived value (`default | medium | high | xhigh`), computed before each batch from `delegate_effort` and the picked level per `references/codex-delegation-workflow.md` ("Per-Batch Effort"). Feeds the `codex exec` invocation in place of `delegate_effort`.

 ---

plugins/compound-engineering/skills/ce-work-beta/references/codex-delegation-workflow.md

Lines changed: 40 additions & 1 deletion
@@ -88,6 +88,45 @@ On decline:

 Delegate all units in one batch. If the plan exceeds 5 units, split into batches at the plan's own phase boundaries, or in groups of roughly 5 -- never splitting units that share files. Skip delegation entirely if every unit is trivial.
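
The batching rule in the context line above could be sketched roughly as follows. This is an illustration, not the skill's implementation: the `units` mapping (unit name to the set of files it touches), the `batch_units` name, and the `target` parameter are all assumptions, phase boundaries and the "every unit is trivial" check are ignored, and "roughly 5" is treated as a soft cap so that units sharing files always stay together.

```python
from collections import defaultdict


def batch_units(units: dict[str, set[str]], target: int = 5) -> list[list[str]]:
    """Group units into batches of roughly `target`, never splitting units that share files.

    `units` maps a hypothetical unit name to the set of file paths it touches.
    """
    names = list(units)
    if not names:
        return []
    if len(names) <= target:
        return [names]  # small plans are delegated as a single batch

    # Union-find: units that touch a common file end up in the same group.
    parent = {name: name for name in names}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    first_owner: dict[str, str] = {}
    for name in names:
        for path in units[name]:
            if path in first_owner:
                parent[find(name)] = find(first_owner[path])
            else:
                first_owner[path] = name

    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[find(name)].append(name)

    # Greedy packing: start a new batch when adding a whole group would exceed the target.
    batches: list[list[str]] = []
    current: list[str] = []
    for group in groups.values():
        if current and len(current) + len(group) > target:
            batches.append(current)
            current = []
        current.extend(group)
    if current:
        batches.append(current)
    return batches
```

A group of file-sharing units larger than `target` is kept whole here, which is one reading of "never splitting units that share files"; a real plan would normally prefer its own phase boundaries first.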

+## Per-Batch Effort
+
+Each batch picks an effort level proportional to its complexity, then resolves against the config floor before invocation.
+
+**Effort levels — guidelines, not predicates**
+
+Pick the level that best fits the batch. These are signals to weigh, not boxes to tick — use judgment.
+
+- **default (no flag)** — trivial work with no behavioral change: a one-line config tweak, a rename, a typo or comment-only fix, a pure documentation update. Defers to the user's `~/.codex/config.toml` default (which is `medium` on a stock Codex install).
+- **`medium`** — small, well-scoped behavioral changes that stay clear of high-risk areas. A handful of files, a single concern, no novel architecture.
+- **`high`** — work that touches a high-risk area (auth/session logic, payments, database migrations, external API contracts, error handling with retries/fallbacks), or work spanning enough surface area that one mistake could cascade.
+- **`xhigh`** — architectural work: cross-cutting refactors, multiple high-risk areas in the same batch, changes that propagate broadly, or anywhere a wrong call meaningfully degrades the project.
+
+When in doubt, lean up one level — under-resourcing risky work costs more than over-resourcing routine work. Briefly note the picked level and the signal that drove it (e.g., "`high` — touches db/migrations") so the choice is auditable.
+
+A few edge cases worth handling explicitly:
+- **Test-only batches:** classify by what the tests *exercise*, not by file paths. Tests for auth flows, payment logic, or migrations get the same level the equivalent implementation work would get.
+- **Mixed-complexity batches:** the batch picks one level. If a single batch combines a typo unit and a payments rewrite, pick the higher level. If the spread feels wasteful, prefer splitting at the batching step (see Batching above) over averaging it out.
+- **Deletion-only batches:** classify by the risk of what is being removed, not by counts of remaining content. Removing an auth module is `high` even if the batch produces zero `Modify` content.
+- **Documentation- or comment-only batches:** `default`.
+
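
The mixed-complexity rule (one level per batch, taking the higher pick) can be illustrated with a small sketch. The per-unit levels, the `batch_level` name, and the placement of `default` below `medium` for comparison purposes are assumptions made for the example; the added text itself leaves the pick to judgment.

```python
# Comparison order for picks only. Treating "default" as the lowest pick is an
# assumption, consistent with the levels being listed from trivial to architectural.
PICK_ORDER = ["default", "medium", "high", "xhigh"]


def batch_level(unit_levels: list[str]) -> str:
    """Return the single level for a batch: the highest level any unit would get."""
    if not unit_levels:
        raise ValueError("a batch needs at least one unit")
    return max(unit_levels, key=PICK_ORDER.index)


# A typo fix batched with a payments rewrite takes the payments level.
print(batch_level(["default", "high"]))
# A documentation-only batch stays at default.
print(batch_level(["default", "default"]))
```

Taking the maximum rather than an average matches the guidance to split the batch earlier if the spread feels wasteful.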
+**Floor and resolution — hard rules**
+
+Effort levels are ordered: `minimal < low < medium < high < xhigh`.
+
+Compute `effective_effort`:
+
+- If `delegate_effort` is unset: `effective_effort = picked_level`.
+- If `delegate_effort` is set: substitute `default` → `medium` in `picked_level`, then `effective_effort = max(picked_level, delegate_effort)`.
+
+Emit based on `effective_effort`:
+
+- `medium`, `high`, or `xhigh` → emit `-c 'model_reasoning_effort="<value>"'`.
+- `default` → omit the flag (defer to `~/.codex/config.toml`). Reachable only when `delegate_effort` is unset and the pick is `default`.
+
+Never pass the literal string `"default"` to `codex exec`.
+
+Store `effective_effort` as a per-batch derived state value (alongside the session-level `delegate_effort`) and use it in place of `delegate_effort` throughout the Execution Loop.
+
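
The floor-and-resolution rules added above can be read as a small pure function. The sketch below is one possible reading, not code from the commit: the function name `resolve_effective_effort` and the use of `None` for "unset" are assumptions.

```python
from typing import Optional

# Ordering stated in the added text: minimal < low < medium < high < xhigh.
ORDER = ["minimal", "low", "medium", "high", "xhigh"]


def resolve_effective_effort(picked_level: str, delegate_effort: Optional[str]) -> str:
    """Combine the per-batch pick with the config floor.

    `delegate_effort=None` stands in for "unset" (an assumption of this sketch).
    """
    if delegate_effort is None:
        # No floor configured: the pick passes through, including "default".
        return picked_level
    if picked_level == "default":
        # With a floor set, "default" is compared as "medium".
        picked_level = "medium"
    # ORDER.index raises ValueError for any level outside the stated ordering.
    return max(picked_level, delegate_effort, key=ORDER.index)


print(resolve_effective_effort("default", None))    # default
print(resolve_effective_effort("default", "low"))   # medium
print(resolve_effective_effort("medium", "high"))   # high
print(resolve_effective_effort("xhigh", "medium"))  # xhigh
```

Because a set floor first lifts `default` to `medium`, the result is never `minimal` or `low`, and `default` survives only when no floor is configured, which matches the "Reachable only when" note.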
 ## Prompt Template

 At the start of delegated execution, create a per-run OS-temp scratch directory via `mktemp -d` and capture its **absolute path** for all downstream use. All scratch files for this invocation live under that directory. Do not use `.context/` — these scratch files are per-run throwaway that get cleaned up when delegated execution ends (see Cleanup below), matching the repo Scratch Space convention for one-shot artifacts. Do not pass unresolved shell-variable strings to non-shell tools (Write, Read); use the absolute path returned by `mktemp -d`.
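
For readers more familiar with Python than shell, the scratch-directory convention can be illustrated with the standard library's analogue of `mktemp -d`. This is only an illustration of the idea (create once, resolve to an absolute path, reuse that literal path, remove at the end); the skill itself specifies `mktemp -d`, and the function names, the directory prefix, and the file name below are hypothetical.

```python
import os
import shutil
import tempfile


def make_scratch_dir() -> str:
    """Create a per-run scratch directory and return its absolute path."""
    # tempfile.mkdtemp plays the role of `mktemp -d`; the prefix is hypothetical.
    path = tempfile.mkdtemp(prefix="codex-delegate-")
    # Resolve once so every later consumer receives a literal absolute path,
    # never an unexpanded shell variable.
    return os.path.realpath(path)


def run_with_scratch() -> str:
    """Write a placeholder file under the scratch directory, then clean up."""
    scratch = make_scratch_dir()
    try:
        prompt_path = os.path.join(scratch, "prompt.md")  # hypothetical file name
        with open(prompt_path, "w", encoding="utf-8") as fh:
            fh.write("placeholder prompt\n")
        return prompt_path
    finally:
        # Per-run throwaway: remove the directory when execution ends.
        shutil.rmtree(scratch, ignore_errors=True)
```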
@@ -239,7 +278,7 @@ codex exec \
 **Conditional flags** — only include each line when the corresponding skill-state value is set:

 - If `delegate_model` is set, insert ` -m "<delegate_model>" \` as a line before `$SANDBOX_FLAG`.
-- If `delegate_effort` is set, insert ` -c 'model_reasoning_effort="<delegate_effort>"' \` as a line before `$SANDBOX_FLAG`.
+- If `effective_effort` is `medium`, `high`, or `xhigh` (resolved via Per-Batch Effort above), insert ` -c 'model_reasoning_effort="<effective_effort>"' \` as a line before `$SANDBOX_FLAG`. When `effective_effort` is `default` (only possible when `delegate_effort` is unset and the pick is `default`), omit the line — never pass the literal string `"default"`.

 When either value is unset, omit its line entirely — Codex resolves the default from the user's `~/.codex/config.toml` (and ultimately the CLI's own built-in default). Do not substitute a placeholder string for unset values.
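
The conditional-flag rules in this hunk could be sketched as a helper that returns the lines to insert before `$SANDBOX_FLAG`. The helper name, the use of `None` for an unset model, and the decision to reject unexpected effort values are assumptions of this sketch; only the `-m` and `-c 'model_reasoning_effort="…"'` forms come from the diff.

```python
from typing import Optional

EMITTABLE = ("medium", "high", "xhigh")


def conditional_flag_lines(delegate_model: Optional[str], effective_effort: str) -> list[str]:
    """Return the continuation lines to place before $SANDBOX_FLAG, in order."""
    lines: list[str] = []
    if delegate_model is not None:
        lines.append(f' -m "{delegate_model}" \\')
    if effective_effort in EMITTABLE:
        lines.append(f" -c 'model_reasoning_effort=\"{effective_effort}\"' \\")
    elif effective_effort != "default":
        # Resolution should never yield anything else; failing loudly is this sketch's choice.
        raise ValueError(f"unexpected effective_effort: {effective_effort!r}")
    # "default" adds no line, so the literal string never reaches `codex exec`.
    return lines


# "example-model" is a placeholder, not a real model name.
for line in conditional_flag_lines("example-model", "high"):
    print(line)
```

With `effective_effort` equal to `default` and no model configured, the helper returns an empty list and the invocation relies entirely on `~/.codex/config.toml`.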
