Commit 622fbfa

feat(ce-ideate): improve for Fable model (#924)
1 parent 4719dc5 commit 622fbfa

7 files changed

Lines changed: 366 additions & 97 deletions

CONCEPTS.md

Lines changed: 13 additions & 0 deletions
@@ -46,6 +46,19 @@ A documented solution to a past problem — a bug fix, a convention, or a workfl
 ### Pattern doc
 Guidance generalized from several Learnings into a broader rule. Higher-leverage than any single incident-level Learning, and higher-risk when stale, because future work treats it as broadly applicable.

+## Skill orchestration
+
+### Model tier
+A semantic cost class for a dispatched sub-agent — extraction (cheapest capable, for retrieval and quoting), generation (mid-tier, for evidence-driven work and mechanical verification), or ceiling (the orchestrator's own model, inherited by omitting any model selection) — declared once per Skill and referenced by tier name so model names never hardcode into skill content.
+
+When a platform cannot select models per agent, every role runs on the inherited model and cost control falls back to structure: read budgets and output caps.
+
+### Evidence dossier
+A bulk evidence artifact — verbatim quotes with source pointers, gathered by a cheap scout agent — written to scratch storage instead of returned inline, so the orchestrator carries only a short gist and downstream agents read the full dossier themselves.
+
+### Load stub
+The inline remnant left in a Skill when load-bearing content moves to a reference file: a load instruction that names what the reference contains and the failure mode of skipping it, while keeping no detail an agent could improvise from — making the load structurally necessary rather than advisory.
+
 ## Review and workflow vocabulary

 ### Reviewer persona
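
The "Model tier" entry added above describes resolving a tier name to a model choice, with structural limits as the fallback. A minimal sketch of that idea follows, not the skill's actual mechanism: the model identifiers, the `Dispatch` type, the `plan_dispatch` function, and every budget number are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    EXTRACTION = "extraction"  # cheapest capable: retrieval and quoting
    GENERATION = "generation"  # mid-tier: evidence-driven work, mechanical verification
    CEILING = "ceiling"        # the orchestrator's own model, inherited


# Hypothetical model identifiers, declared in one place only.
# None means "omit model selection", so the sub-agent inherits.
TIER_MODELS: dict[Tier, Optional[str]] = {
    Tier.EXTRACTION: "example-small",
    Tier.GENERATION: "example-medium",
    Tier.CEILING: None,
}

# Illustrative structural limits per tier: (files to read, output words).
TIER_BUDGETS: dict[Tier, tuple[int, int]] = {
    Tier.EXTRACTION: (20, 400),
    Tier.GENERATION: (10, 800),
    Tier.CEILING: (10, 1200),
}


@dataclass(frozen=True)
class Dispatch:
    role: str
    model: Optional[str]   # None -> inherit the orchestrator's model
    max_files_read: int    # read budget
    max_output_words: int  # output cap


def plan_dispatch(role: str, tier: Tier, per_agent_models: bool = True) -> Dispatch:
    """Resolve a tier name to a dispatch plan.

    When the platform cannot pick a model per agent, every role inherits
    the orchestrator's model and only the budgets control cost.
    """
    model = TIER_MODELS[tier] if per_agent_models else None
    max_files, max_words = TIER_BUDGETS[tier]
    return Dispatch(role, model, max_files, max_words)
```

Because callers pass only a `Tier`, swapping a model means editing `TIER_MODELS` and nothing else, which is the property the glossary entry asks for.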

docs/skills/ce-ideate.md

Lines changed: 6 additions & 5 deletions
@@ -58,23 +58,23 @@ Asking an AI "what's worth exploring here?" usually returns:

 ### 1. Comprehensive grounding before any idea is generated

-Every run starts with parallel grounding agents that supply the substance ideas will be qualified against — codebase scan (in repo mode), past institutional learnings from `docs/solutions/`, external prior art via web research, and optional Slack and issue intelligence when those tools are available. **External prior art is critical**: without it, the agent is just remixing what's already in your codebase or your head. With it, ideas can cite "this is how X solved this" — concrete, verifiable, named precedent.
+Every run starts with parallel grounding agents that supply the substance ideas will be qualified against — codebase scan (in repo mode), past institutional learnings from `docs/solutions/`, external prior art via web research, and optional Slack and issue intelligence when those tools are available. In repo mode, cheap **evidence scouts** then deepen the grounding: one per topic axis, each returning a dossier of verbatim quotes and `file:line` pointers, so ideation agents cite real code rather than a paraphrased summary. **External prior art is critical**: without it, the agent is just remixing what's already in your codebase or your head. With it, ideas can cite "this is how X solved this" — concrete, verifiable, named precedent.

 ### 2. Basis requirement — every idea cites its evidence

 Each surviving candidate carries a tagged basis: `direct:` (quoted evidence), `external:` (named prior art), or `reasoned:` (written-out first-principles argument, not a gesture). Speculation that sounds plausible but has no basis is rejected. **Comprehensive grounding + basis requirement is the dual anti-slop mechanism.** One without the other is weaker: grounding without a basis gives well-informed speculation; a basis without grounding gives clever-sounding rationalization.

 ### 3. Six-frame divergent generation

-Six parallel sub-agents, each biased toward a different generative frame: pain & friction, inversion/removal/automation, assumption-breaking, leverage & compounding, cross-domain analogy, and constraint-flipping. Single-prompt ideation collapses into the agent's most-trained directions — different frames force genuine breadth, especially cross-domain analogy and constraint-flipping which surface ideas no single prompt would.
+Parallel sub-agents cover six generative frames: pain & friction, inversion/removal/automation, assumption-breaking, leverage & compounding, cross-domain analogy, and constraint-flipping. Single-prompt ideation collapses into the agent's most-trained directions — different frames force genuine breadth, especially cross-domain analogy and constraint-flipping which surface ideas no single prompt would. The fleet is **cost-tiered**: evidence-driven frames run on a mid-tier model (the dossiers do the heavy lifting), while the ceiling frames — where the strong model's reasoning is the product — inherit the conversation's model. Say `go deep` to raise the whole fleet to the top tier.

 ### 4. Topic-surface decomposition — axis coverage as a dispatch invariant

 Frames decide *how to think* about a topic; **axes** decide *what part of the topic to think on*. Before frame dispatch, the orchestrator decomposes the topic into 3-5 orthogonal axes derived from grounding (e.g., for "social sharing" — send, discovery, arrival, compounding, actor types). Each frame is then instructed to spread its ideas across axes, and an axis-coverage check after generation catches blind spots — if any axis has zero ideas, a bounded recovery dispatch fills it. The failure mode this prevents: six lenses converging on the most salient interpretation of a topic and missing the rest of its surface entirely. Atomic topics (a name, a tagline) and surprise-me runs skip decomposition cleanly.

 ### 5. Adversarial filtering with stated rejection reasons

-The orchestrator critiques every candidate against a consistent rubric — groundedness, basis strength, expected value, novelty, pragmatism, leverage, implementation burden, overlap. One-line reasons accompany every rejection. Survivors are presented alongside a rejection summary so you see what was considered and cut.
+Critique runs in two layers. A **fresh-context verifier** — an agent that never saw the generation — tries to refute each candidate: do cited quotes actually exist, is the named prior art real, does the argument hold? Then the orchestrator arbitrates the final cut against a consistent rubric — groundedness, basis strength, expected value, novelty, pragmatism, leverage, implementation burden, overlap. One-line reasons accompany every rejection. Survivors are presented alongside a rejection summary so you see what was considered and cut.

 ### 6. Three modes — software, software-product, and entirely non-software
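
The axis-coverage check described in section 4 of the hunk above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's implementation: the idea shape (`{"title", "axis"}`), the function names, and the `recover` callback standing in for the recovery dispatch are all hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence

Idea = dict  # illustrative shape: {"title": str, "axis": str}


def uncovered_axes(axes: Sequence[str], ideas: Sequence[Idea]) -> list[str]:
    """Return the axes that received zero ideas, in their original order."""
    counts = Counter(idea["axis"] for idea in ideas)
    return [axis for axis in axes if counts[axis] == 0]


def ensure_axis_coverage(
    axes: Sequence[str],
    ideas: Sequence[Idea],
    recover: Callable[[list[str]], Sequence[Idea]],
) -> list[Idea]:
    """Fill empty axes with at most one recovery dispatch.

    The recovery is bounded: `recover` is called once at most and its
    result is not re-checked, so the function cannot loop.
    """
    missing = uncovered_axes(axes, ideas)
    if not missing:
        return list(ideas)
    return list(ideas) + list(recover(missing))
```

With the "social sharing" axes from the text (send, discovery, arrival, compounding, actor types), a run whose ideas all land on "send" would trigger a single `recover` call for the other four axes.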

@@ -92,9 +92,9 @@ Phrases like "what users are reporting" or "biggest issue patterns" trigger an i

 ## Quick Example

-You invoke `ce-ideate "DX improvements"` from inside a code repo. The agent announces it'll dispatch ~9 grounding and ideation agents and offers skip phrases for cost control.
+You invoke `ce-ideate "DX improvements"` from inside a code repo. The agent announces it'll dispatch ~13 agents — most on cheap tiers — and offers skip phrases for cost control.

-Grounding agents return in parallel — a codebase summary, relevant past learnings, external prior art on developer-experience patterns. The orchestrator decomposes the topic into 4-5 axes derived from that grounding (e.g., for "DX improvements" — feedback loops, environment friction, tooling ergonomics, knowledge accessibility, automation surface). Six ideation sub-agents then generate candidates from different frames, each tagged with the axis it targets. The orchestrator merges 40+ candidates into one list, synthesizes cross-cutting combinations, runs an axis-coverage check (any empty axis triggers one bounded recovery dispatch), and runs the adversarial critique pass — about 13 ideas are cut for being too vague, unjustified, or duplicative.
+Grounding agents return in parallel — a codebase summary, relevant past learnings, external prior art on developer-experience patterns. The orchestrator decomposes the topic into 4-5 axes derived from that grounding (e.g., for "DX improvements" — feedback loops, environment friction, tooling ergonomics, knowledge accessibility, automation surface), then cheap evidence scouts gather a quote-and-pointer dossier per axis. Five ideation sub-agents covering six frames generate candidates from that evidence, each idea tagged with the axis it targets and verified against the actual files before submission. The orchestrator merges 40+ candidates into one list, synthesizes cross-cutting combinations, runs an axis-coverage check (any empty axis triggers one bounded recovery dispatch), and runs the two-layer critique pass — a fresh-context verifier tries to refute each candidate, then the orchestrator makes the final cut. About 13 ideas are cut for being too vague, unjustified, refuted, or duplicative.

 The full deliverable — all seven cards with basis, rationale, downsides, confidence, complexity, plus the rejection summary — is written automatically to a self-contained HTML file and opened in your browser; the session itself shows just a concise ranked summary and the path, so you read the rich version, not a wall of terminal text. Then a four-option next-steps menu: open it in the browser, brainstorm one idea with `ce-brainstorm`, iterate on one idea (adjust or ask, staying here), or done. (Markdown runs swap "open in browser" for "open and iterate in Proof".)
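
The two-layer critique pass in the example above can be approximated mechanically for one narrow case. The sketch below is a simplification with loud assumptions: in the document the verifier is a fresh-context agent that also judges prior art and arguments, whereas this code only checks that a `direct:` quote appears in the cited file, and the orchestrator's rubric is reduced to a duplicate-title check. `Candidate`, `verify`, `critique`, and the in-memory `files` mapping are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

BASIS_TAGS = ("direct", "external", "reasoned")


@dataclass
class Candidate:
    title: str
    basis: str                         # e.g. "direct: <verbatim quote>"
    source_path: Optional[str] = None  # file the quote is claimed to come from


def verify(candidate: Candidate, files: dict[str, str]) -> Optional[str]:
    """Layer 1: return a one-line refutation, or None if not refuted."""
    tag, sep, body = candidate.basis.partition(":")
    tag, body = tag.strip(), body.strip()
    if not sep or tag not in BASIS_TAGS:
        return "no tagged basis"
    if not body:
        return "empty basis"
    if tag == "direct":
        text = files.get(candidate.source_path or "", "")
        if body not in text:
            return "quoted evidence not found in cited file"
    return None  # external/reasoned bases need a real reviewer; not checked here


def critique(candidates: list[Candidate], files: dict[str, str]):
    """Layer 2: keep unrefuted, non-duplicate candidates; record reasons."""
    survivors: list[Candidate] = []
    rejected: list[tuple[Candidate, str]] = []
    seen_titles: set[str] = set()
    for candidate in candidates:
        reason = verify(candidate, files)
        key = candidate.title.lower()
        if reason is None and key in seen_titles:
            reason = "duplicate of an earlier candidate"
        if reason is None:
            seen_titles.add(key)
            survivors.append(candidate)
        else:
            rejected.append((candidate, reason))
    return survivors, rejected
```

Returning `rejected` alongside `survivors` mirrors the rejection summary with one-line reasons that the document says accompanies every run.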

@@ -175,6 +175,7 @@ The deliverable is written automatically — you don't have to ask. If a run was
 | `<path>` | a directory or file to focus on |
 | `<constraint>` | e.g., `low-complexity quick wins`, `polish-only` |
 | `surprise me` | Surprise-me mode |
+| `go deep` | Maximum depth: every ideation agent runs on the top-tier model, verification budgets double, and a second critic joins the filtering pass |
 | `top issue themes in <area>` | Triggers issue-tracker intent |
 | `output:md` | Write the artifact as markdown instead of the default self-contained HTML (`output:html` forces HTML explicitly). Also settable per-project via `ideate_output` in `.compound-engineering/config.local.yaml` |
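
To make the argument table above concrete, here is a minimal sketch of how a few of those phrases could be pulled out of an invocation string. It is an assumption that anything like this exists in the skill, which may well interpret arguments in natural language; `RunOptions` and `parse_invocation` are hypothetical names, and only `go deep`, `surprise me`, and `output:md` / `output:html` are handled.

```python
import re
from dataclasses import dataclass


@dataclass
class RunOptions:
    topic: str
    surprise_me: bool = False
    go_deep: bool = False
    output: str = "html"  # the table names self-contained HTML as the default


def parse_invocation(text: str) -> RunOptions:
    """Split an invocation into recognised flags and the remaining topic."""
    opts = RunOptions(topic="")

    # output:md or output:html selects the artifact format.
    match = re.search(r"\boutput:(md|html)\b", text)
    if match:
        opts.output = match.group(1)
        text = text.replace(match.group(0), " ")

    # Phrase flags are matched case-insensitively and removed from the topic.
    for phrase, attr in (("go deep", "go_deep"), ("surprise me", "surprise_me")):
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        if pattern.search(text):
            setattr(opts, attr, True)
            text = pattern.sub(" ", text)

    # Whatever is left, with whitespace collapsed, is the topic.
    opts.topic = " ".join(text.split())
    return opts
```

For instance, `parse_invocation("DX improvements go deep output:md")` yields a topic of `DX improvements` with `go_deep` set and `output` equal to `md`.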
