Cross-platform skill compatibility: agent-neutral prose, source-verified per-runtime tool refs #1486
Open
obra wants to merge 5 commits into
added 5 commits
May 5, 2026 18:25
Replace generic third-person "Claude" with "agents" / "your agent" forms across active skill prose, the README intro, and the vendored anthropic-best-practices.md reference.
Carve-outs preserved: historical attribution paths, the "Variant C: Claude.AI Emphatic Style" example label, model identifiers (Haiku/Sonnet/Opus), and the "In Claude Code:" per-platform skill-dispatch list.
Coined-term rename: "Claude Search Optimization (CSO)" → "Skill Discovery Optimization (SDO)" in writing-skills/SKILL.md.
Files in this commit also pick up later-phase changes that accumulated on the same files (dispatching-parallel-agents code-example transformation, writing-skills numbering and path fixes). The bundled spec at docs/superpowers/specs/ records the original scope and the carve-outs.
README.md gets only its prose change here; the alphabetization lands in Phase C's commit.
Two structural changes:
1. Generalize CLAUDE.md-specific guidance:
   - "Project-specific conventions (put in CLAUDE.md)" → "(put in
     your instructions file)" in writing-skills/SKILL.md
   - "(explicit CLAUDE.md violation)" → "(explicit instruction-file
     violation)" in receiving-code-review/SKILL.md
   - The instruction-priority list in using-superpowers/SKILL.md
     stays inclusive (CLAUDE.md, GEMINI.md, AGENTS.md); that's
     load-bearing, not a substitution opportunity.
2. Per-platform tool reference files at
   skills/using-superpowers/references/{claude-code,codex,copilot,gemini}-tools.md.
   Each ref documents:
   - The runtime's preferred instructions file (CLAUDE.md, AGENTS.md,
     GEMINI.md, etc.) and how it loads
   - The runtime's personal-skills directory + cross-runtime
     ~/.agents/skills/ path where applicable
   - Action-language → tool-name mapping table
Tool names and table content reflect the source-verified state from
direct inspection of openai/codex, google-gemini/gemini-cli,
sst/opencode, and the installed @github/copilot package. Filenames
and behaviors are sourced from each runtime's official docs.
Files in this commit also pick up later-phase changes that
accumulated on the same files (using-superpowers/SKILL.md "How to
Access Skills" overhaul, action-language flowchart, refs' final
table content). The bundled spec records original scope.
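To make the "action-language → tool-name" idea concrete, here is a minimal sketch of such a lookup. The tool names are the ones quoted elsewhere in this PR; the `TOOL_REFS` structure, the action keys, and `resolve` are hypothetical names for illustration only and do not mirror the actual layout of the `*-tools.md` files.

```python
# Illustrative sketch only. Action keys and this dict layout are assumptions;
# the tool names on the right are the ones cited in this PR's text.
TOOL_REFS = {
    "claude-code": {"dispatch_subagent": "Task", "invoke_skill": "Skill"},
    "gemini": {"dispatch_subagent": "invoke_agent"},
    "opencode": {
        "dispatch_subagent": "task",  # takes a subagent_type parameter
        "track_todo": "todowrite",
        "invoke_skill": "skill",
    },
}


def resolve(runtime: str, action: str) -> str:
    """Return the native tool name for an action, or raise if undocumented."""
    try:
        return TOOL_REFS[runtime][action]
    except KeyError:
        raise LookupError(
            f"no documented tool for {action!r} on {runtime!r}"
        ) from None


print(resolve("gemini", "dispatch_subagent"))
```

A skill that says "dispatch a subagent" stays runtime-neutral; only the lookup needs to know each runtime's native name.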
Quickstart link list and the per-harness install sub-sections both reorder to strict alphabetical: Claude Code, Codex App, Codex CLI, Cursor, Factory Droid, Gemini CLI, GitHub Copilot CLI, OpenCode.
Three blocks moved (Codex App swaps with Codex CLI; Cursor moves up two slots; GitHub Copilot CLI moves up one). Claude Code stays first by alphabetical chance.
Each install sub-section's content is byte-identical pre/post; only the positions change. Quickstart anchors verified against the new heading order.
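The ordering claim is easy to check mechanically. A small sketch, assuming a case-insensitive comparison is what "strict alphabetical" means here; the helper name is hypothetical:

```python
# Harness names exactly as listed in the commit message above.
HARNESSES = [
    "Claude Code",
    "Codex App",
    "Codex CLI",
    "Cursor",
    "Factory Droid",
    "Gemini CLI",
    "GitHub Copilot CLI",
    "OpenCode",
]


def is_strictly_alphabetical(names):
    """True when each name sorts strictly before the next, ignoring case."""
    keys = [name.casefold() for name in names]
    return all(a < b for a, b in zip(keys, keys[1:]))


print(is_strictly_alphabetical(HARNESSES))
```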
Misc platform/runtime statements and adjacencies that don't fit the prose, config-ref, README-ordering, or tool-vocabulary buckets:
- visual-companion frame template: rename CSS/HTML id #claude-content → #frame-content. The id is purely styling; nothing external references it. The brainstorm-server test that asserted the old string is updated in lockstep.
- visual-companion launch instructions: add a Copilot CLI section alongside Claude Code, Codex, and Gemini CLI; combine the Claude Code (macOS / Linux) and (Windows) sections so heading style matches the other (non-OS-qualified) platforms.
- visual-companion: "Use Write tool" → "Use your file-creation tool" for the cat/heredoc warning. The prohibition is what's load-bearing, not the tool name.
- executing-plans/SKILL.md: list all subagent-capable runtimes (Claude Code, Codex CLI, Codex App, Copilot CLI, Gemini CLI) and point at the per-platform tool refs as the source of truth.
- executing-plans/SKILL.md: relative path "using-superpowers/references/" → "../using-superpowers/references/" to resolve correctly from the executing-plans/ directory.
No bundled spec doc here: Phase D was scope-extension work that took place across rounds, with no standalone spec authored.
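The relative-path fix in the last bullet can be illustrated with plain path arithmetic. This sketch assumes executing-plans lives at `skills/executing-plans` next to `skills/using-superpowers`; that location and the `resolve_from` helper are assumptions, not something this commit message states.

```python
import posixpath


def resolve_from(skill_dir: str, link: str) -> str:
    """Resolve a relative link as seen from a skill's own directory."""
    return posixpath.normpath(posixpath.join(skill_dir, link))


SKILL_DIR = "skills/executing-plans"  # assumed location of the skill

old = resolve_from(SKILL_DIR, "using-superpowers/references/")
new = resolve_from(SKILL_DIR, "../using-superpowers/references/")

print(old)  # skills/executing-plans/using-superpowers/references
print(new)  # skills/using-superpowers/references
```

The old form points inside executing-plans/ itself, which is why the `../` prefix is needed to reach the sibling skill.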
Replace Claude-Code-specific tool names in skill prose, prompt
templates, and OpenCode-facing docs with action-language descriptions
that resolve to each runtime's native tool via the per-platform refs.
Changes by category:
- Prose mentions ("Use TodoWrite to track...", "Use Task tool with
general-purpose type") → action language ("Track each item as a
todo", "Dispatch a general-purpose subagent")
- Prompt template headers (6 files): "Task tool (general-purpose):"
→ "Subagent (general-purpose):" — preserves the type information
without naming Claude Code's specific dispatch tool
- DOT flowchart node labels: "Invoke Skill tool" → "Invoke the
skill"; "Create TodoWrite todo per item" → "Create a todo per
item"
- OpenCode INSTALL.md and docs/README.opencode.md: replace the old
"TodoWrite → todowrite, Task → @mention" mapping (which both
taught a vocabulary skills no longer use AND was wrong about
@mention being a real OpenCode syntax) with an action-language
mapping verified against the installed OpenCode CLI's tool
inventory.
The platform-tools refs landed in Phase B already document each
runtime's resolution; skills now speak in the actions those refs
map. Tool names that genuinely belong only in the per-platform
dispatch section ("In Claude Code: Use the `Skill` tool") and the
Claude-Code-specific Bash run_in_background flag note in
visual-companion remain — those are intentional carve-outs.
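As an illustration of the prompt-template header change only (not a claim about how the edits in this commit were produced), the rewrite can be expressed as one regular-expression substitution. `HEADER_RE` and `neutralize_headers` are hypothetical names.

```python
import re

# Matches a template header such as "Task tool (general-purpose):" at the
# start of a line and captures the subagent type inside the parentheses.
HEADER_RE = re.compile(r"^Task tool \(([^)]+)\):", flags=re.MULTILINE)


def neutralize_headers(text: str) -> str:
    """Rewrite Claude-Code-specific headers, keeping the type information."""
    return HEADER_RE.sub(r"Subagent (\1):", text)


sample = "Task tool (general-purpose):\n  description: review the diff"
print(neutralize_headers(sample))
```

Capturing the type rather than hard-coding it preserves the type information the commit message calls out, whatever type a template names.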
This was referenced May 6, 2026
What problem are you trying to solve?
Superpowers ships to Claude Code, Codex (CLI + App), Copilot CLI, Cursor, Factory Droid, Gemini CLI, and OpenCode, but the skill content was written first for Claude Code and bakes in Claude-Code-specific names, paths, and tool vocabulary. The motivating concrete failure was openai/plugins#217 — OpenAI's vendored fork of Superpowers attempted a wholesale Claude→Codex find-and-replace and got most of it wrong (rewrote historical attribution paths, replaced model names, broke install instructions). Their PR landed because nobody upstream had separated which references were actually Claude-Code-specific from which were generic prose that just happened to say "Claude".
Symptoms users hit on non–Claude-Code runtimes:
- […] `subagent-driven-development` ship prompt templates with a `Task tool (general-purpose):` header that doesn't match what Codex / Copilot CLI / Gemini CLI expose. Users either ignore the template or attempt to call a tool that doesn't exist on their runtime.
- `using-superpowers/SKILL.md` told users to "see `references/copilot-tools.md` (Copilot CLI), `references/codex-tools.md` (Codex)" but the priority list of instruction files was Claude-Code-first (CLAUDE.md, GEMINI.md, AGENTS.md), and the per-runtime instruction files weren't documented for each platform.
- `gemini-tools.md` mapped subagent dispatch to `@agent-name`. The actual Gemini CLI tool is `invoke_agent` (per `AGENT_TOOL_NAME = 'invoke_agent'` in `tool-names.ts`); `@generalist` is chat sugar that triggers `invoke_agent`. Same class of error in `copilot-tools.md` (claimed `create`/`edit` tools that don't exist; Copilot uses `apply_patch` and `rg`, not `create`/`edit` and `grep`) and in `OpenCode INSTALL.md` (claimed `Task → @mention syntax`; OpenCode's real subagent dispatch tool is `task` with a `subagent_type` parameter, no @mention syntax).
The work happened in five phases driven by clarifying questions through `superpowers:brainstorming`. Each phase scope was approved before implementation, then validated against authoritative source before the next phase.
What does this PR change?
Five logically-distinct changes, one commit each:
- […] `skills/using-superpowers/references/{claude-code,codex,copilot,gemini}-tools.md`.
- […] `#claude-content` → `#frame-content` in the visual-companion frame template, fix the test that asserted the old name; add Copilot CLI launch instructions; consolidate Claude Code launch sections; resolve relative paths between sibling skills; expand the subagent-capable runtimes list).
- […] `Task`, `TodoWrite`, `Skill`, `Read`, `Write`, `Bash`, etc.) with action-language descriptions in skill prose, prompt templates (`Subagent (general-purpose):` instead of `Task tool (general-purpose):`), and OpenCode docs. Per-runtime tool refs unified around the same vocabulary, with right-column tool names verified against each runtime's source.
Is this change appropriate for the core library?
Yes. Cross-platform support is core to Superpowers — the project explicitly targets seven harnesses, and the README's first install section is no longer Claude Code by accident; it's first by alphabet. Every change here is a generalization that reduces a Claude-Code-specific assumption. None of it adds project-specific behavior, third-party integrations, or new dependencies. The per-runtime instruction-file conventions, personal-skills-directory paths, and tool mappings are factual reference material that benefits any agent reading a Superpowers skill on any supported runtime.
What alternatives did you consider?
- […] `git blame` useful.
For each multi-phase file, I considered file-level granularity (commit each file in its primary phase with full final content) vs. hunk-level (split changes across phase commits). Chose file-level for most, with `README.md` split (Phase A: prose change; Phase C: alphabetization) because that file's two concerns are genuinely independent and the Phase A line was the only single-line touch.
Does this PR contain multiple unrelated changes?
No — every change is "make a Superpowers skill / doc / reference work correctly across the supported runtimes". Multiple files and multiple phases, but one coherent direction. The five phase commits exist because the changes are reviewable in isolation, not because they're independent — applying any subset would be incomplete.
Existing PRs
No upstream PRs in `obra/superpowers` cover this scope. The closest neighbours are individual harness-support PRs (e.g., `codex/openclaw-native-plugin`, `claude/review-opencode-*`); each adds support for one runtime. This PR doesn't add a runtime; it normalizes content that already mentions all of them.
Environment tested
Plus drill (the in-repo skill compliance harness) at `../drill`, which orchestrates real tmux sessions and runs an LLM verifier against transcripts.
Evaluation
Initial prompt (paraphrased; the actual session was iterative):
Validation runs: the change went through six rounds of adversarial code review (two parallel reviewers per round, each with a "5 points to whoever finds the most legitimate issues" framing). Each round surfaced findings that were verified against source before fixing; three rounds caught false positives that earlier rounds had merged on insufficient evidence (e.g., "TaskCreate is fabricated": actually real, verified via ToolSearch in the live session; ".claude/rules/ is a Cursor feature": actually a real Claude Code feature per code.claude.com/docs/en/memory; "Codex has a read_file tool": actually false, verified by grepping the cloned `codex-rs/tools/src/`).
Drill runs: ran 17 scenarios from `../drill` covering skill triggering, SDD with the new prompt-template format, code review, spec review, Codex tool-mapping comprehension, Gemini tool-mapping comprehension, and worktree creation/detection. Final tally: 15/17 PASS. Two explained failures:
- `claim-without-verification-naive` (4 pass / 1 fail): pre-existing PRI-1258 verification-reflex flakiness; the test's hard assertion on `verification-before-completion` skill loading before `git commit` did not fire. This scenario also failed at the same skill assertion before my changes; not a regression introduced here.
- `gemini-subagent-tool-mapping-comprehension` (7 pass / 1 fail): improved from 3p/5f in the prior baseline. The Gemini agent answered `task_dispatch: "invoke_agent"` (the source-verified correct answer); all hard regex assertions passed; one LLM judge criterion still marked fail despite the criterion text accepting `invoke_agent` as a valid form. Looks like judge variance, not an agent or doc problem. Drill scenario fix included as a separate commit in the drill repo (`013fcb8 gemini-subagent scenario: accept invoke_agent / generalist as correct`); that's the one that took the 8-criterion fail count from 5 to 1.
Direct empirical verification of the most-contested claim: the "Gemini's subagent dispatch tool is `invoke_agent`, not `@agent-name` or `@generalist`" assertion was verified in three independent ways:
- […] `AGENT_TOOL_NAME = 'invoke_agent'` in `packages/core/src/tools/tool-names.ts`.
- […] `docs/core/subagents.md` "Forcing a subagent (@ syntax)": `@generalist` injects a system note that triggers `invoke_agent`.
- […] `gemini --yolo` then `@generalist Calculate 7 times 8`; status line: `≡ Agent Completed ✓ generalist · Completed successfully`, output `56`. The tool exposed in the dispatch flow is `invoke_agent`; `@generalist` is the chat shortcut that calls it.
Source verifications performed:
- […] `openai/codex` and grepped `codex-rs/tools/src/` for the canonical Codex tool list (no `read_file`, `view`, or `web_fetch` exist; `web_search` does and is enabled by default).
- […] `google-gemini/gemini-cli` and read `packages/core/src/tools/tool-names.ts`, `packages/core/src/agents/agent-tool.ts`, and `docs/core/subagents.md`.
- […] `sst/opencode` and inspected `packages/opencode/src/tool/` (confirmed tool ids: `bash`, `read`, `write`, `edit`, `grep`, `glob`, `task`, `todowrite`, `webfetch`, `websearch`, `skill`, `apply_patch`, `lsp`, `plan`).
- […] `@github/copilot` package at `/opt/homebrew/lib/node_modules/@github/copilot/` (no `stop_agent` exists; only `read_agent`, `list_agents`, `write_agent`. `web_search` exists. `bash` async mode is `mode: "async"` with `detach: true`, not `async: true` parameter).
- […] `ToolSearch` in the active Claude Code session to load schemas for `TaskCreate` / `TaskUpdate` / `TaskList` / `TaskGet` / `TaskOutput` / `TaskStop` and confirmed which are todo-tracking and which are background-process lifecycle.
- […] `*-prompt` queries against installed Codex / Copilot CLI / Gemini CLI / OpenCode for tool inventories. Treated as confirming evidence only; the lesson from this PR is that live agent answers can hallucinate or omit, so source is authoritative when they conflict.
Rigor
- […] `superpowers:writing-skills` (read but not invoked end-to-end; the changes are content normalization, not new skill authoring) and completed adversarial pressure testing (six rounds of parallel reviewers, results in the session transcript).
- […] `using-superpowers/SKILL.md` is unchanged. The "human partner" terminology is unchanged everywhere it appears. The only behavior-shaping change is the action-language tool vocabulary (Phase E), which is validated by the SDD drill scenarios continuing to pass with the new `Subagent (general-purpose):` prompt-template format.
Carve-outs that were preserved deliberately (these are documented in the design specs at `docs/superpowers/specs/2026-05-05-platform-neutral-*-design.md`):
- […] `using-superpowers/SKILL.md`: that section's whole point is to enumerate runtime-specific tool names; generalizing it would defeat its purpose.
- `visual-companion.md`'s `run_in_background` Bash flag note: that's a real Claude-Code-specific Bash tool parameter, not generic prose.
- […] `/Users/jesse/.claude/CLAUDE.md` in `CREATION-LOG.md`); model names (Claude Haiku/Sonnet/Opus); URLs (`claude.com`, `claude.ai`); the vendored Anthropic doc's filename (`anthropic-best-practices.md`); the `Claude.AI Emphatic Style` example label in `CLAUDE_MD_TESTING.md`. These are facts or product names, not platform-leak.
Human review
@arittr — your review please. Particular attention requested on:
- […] `Task tool (general-purpose):` → `Subagent (general-purpose):`) is the highest-leverage behavior change. Drill's `sdd-rejects-extra-features` (7/7 pass) and `code-review-catches-planted-bugs` (5/5 pass) confirm Claude Code still dispatches correctly with the new format, but if you spot a runtime where it could regress, flag it.
- […] `docs/README.opencode.md`, `.opencode/INSTALL.md`): the old mapping table claimed `Task → @mention syntax`, which is just wrong; the new mapping is verified against the installed OpenCode CLI's tool inventory. Worth a sanity-check that the new examples are installable as written.
The companion drill-scenario fix is in the drill repo at `013fcb8` (already pushed to drill's main).
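For reviewers unfamiliar with what "accept invoke_agent / generalist as correct" might look like as a hard assertion, here is a rough sketch. Drill's real scenario format is not shown in this PR, so the regex and the `names_valid_dispatch` helper are hypothetical stand-ins rather than the code in `013fcb8`.

```python
import re

# Accept either the source-verified tool name or the chat shortcut that
# triggers it. Word boundaries avoid matching longer identifiers.
ACCEPTED = re.compile(r"\b(?:invoke_agent|generalist)\b")


def names_valid_dispatch(answer: str) -> bool:
    """True when an agent's answer names an accepted Gemini dispatch form."""
    return ACCEPTED.search(answer) is not None


print(names_valid_dispatch('task_dispatch: "invoke_agent"'))
print(names_valid_dispatch('task_dispatch: "@agent-name"'))
```

A regex like this is deterministic, which is why the hard assertions passed while the one LLM-judged criterion could still vary.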