-description: Create new agent skills following the Agent Skills open standard (agentskills.io). Interviews the user relentlessly about intent, scope, and edge cases before drafting. Covers SKILL.md structure, frontmatter, progressive disclosure, description optimization, script bundling, and review. Use when the user wants to create a skill, write a skill, build a new skill, make a skill, draft a SKILL.md, or mentions "create-skill". Also use when asked to package expertise, workflows, or domain knowledge into a reusable skill.
+description: Create new agent skills following the Agent Skills open standard (agentskills.io). Interviews the user relentlessly about intent, scope, and edge cases before drafting. Covers SKILL.md structure, frontmatter, progressive disclosure, description optimization, script bundling, sub-command architecture, setup gates, context systems, and review. Use when the user wants to create a skill, write a skill, build a new skill, make a skill, draft a SKILL.md, or mentions "create-skill". Also use when asked to package expertise, workflows, or domain knowledge into a reusable skill.
 ---

 # Create Skill
@@ -9,21 +9,32 @@ Create agent skills following the [Agent Skills open standard](https://agentskil

 ## Phase 1: Interview

-Interview the user relentlessly about every aspect of this skill until reaching shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
+Interview the user about every aspect of this skill until reaching shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.

-Ask questions one at a time. If a question can be answered by exploring the codebase, explore the codebase instead.
+### Interview cadence

-Cover these areas before writing anything:
+Ask **one question at a time**. Wait for the answer before asking the next. Adapt follow-ups based on what you learn. Each question should provide clear benefit toward building a better skill — cut questions the codebase can answer for you.

-- **What task does this skill cover?** What specific problem does it solve? What does the user do today without it?
-- **Scope boundaries.** What should this skill NOT do? What adjacent tasks should be left to other skills or the agent's general capabilities?
-- **Input/output.** What does the user provide? What does the skill produce? Are there specific formats?
-- **Edge cases.** What goes wrong? What are the common mistakes? What gotchas would a new user hit?
-- **Success criteria.** How do you know the skill worked correctly?
-- **What can be scripted?** Actively look for operations that can be deterministic code rather than LLM instructions. Scripts are cheaper, faster, and more reliable. The more of a skill that runs as scripts, the less compute it burns. Only leave judgment calls and creative reasoning to instructions.
-- **References needed?** Is there domain knowledge too large for the main SKILL.md that should live in separate files?
-- **Existing patterns.** Are there similar skills or workflows to draw from? Check the codebase for conventions.
-- **Platform constraints.** Will this skill run on macOS, Windows, and Linux? Scripts must handle path separators, temp directories, and shell differences across platforms.
+If a question can be answered by exploring the codebase, explore the codebase instead of asking.
+
+Focus areas, roughly in order:
+
+1. **Purpose and audience.** What task does this skill cover? What specific problem does it solve? What does the user do today without it?
+2. **Scope boundaries.** What should this skill NOT do? What adjacent tasks belong to other skills?
+3. **Input/output.** What does the user provide? What does the skill produce? Specific formats?
+4. **Edge cases.** What goes wrong? Common mistakes? Gotchas for new users?
+5. **Success criteria.** How do you know the skill worked correctly?
+6. **What can be scripted?** Look for deterministic operations that should be code, not LLM instructions. Scripts are cheaper, faster, and more reliable.
+7. **References needed?** Domain knowledge too large for SKILL.md that should live in separate files?
+8. **Existing patterns.** Similar skills or workflows to draw from? Check the codebase.
+9. **Platform constraints.** macOS, Windows, and Linux? Scripts must handle path separators, temp directories, and shell differences.
+10. **Architecture.** Based on the answers above, decide whether the skill's scope warrants sub-commands, context systems, or setup gates. Most skills don't need this — skip to Phase 2 if the skill is straightforward. If it does, ask:
+    - **Should this be one skill with sub-commands or multiple skills?** One skill with a router table prevents menu pollution. Multiple skills are appropriate when the tasks have no shared context or setup.
+    - **Does the skill need project-level context?** If every command needs the same background (project config, conventions), design a context file pattern with a loader script.
+    - **Are there mandatory setup gates?** Steps that must pass before any work begins (context loaded, config valid, dependencies present). Gates prevent generic output.
+    - **Does behavior vary by task type?** If so, design a register/mode system that classifies the task first, then loads different references.
+
+Read `references/architecture-patterns.md` when the skill needs sub-commands, context systems, or setup gates.

 Do not proceed to Phase 2 until the user confirms the scope is complete.

@@ -56,12 +67,112 @@ Keep the SKILL.md body under 500 lines. If approaching this limit, split domain-

 ### Writing patterns

-- Use imperative form: "Run the command" not "You should run the command"
-- Define output formats with templates when the output structure matters
-- Include concrete examples showing input → output
-- Add gotchas sections for common mistakes
-- Use checklists for multi-step workflows
-- Tell the agent *when* to load each reference file: "Read `references/api-errors.md` if the API returns a non-200 status code" is better than "see references/ for details"
+- **Imperative form**: "Run the command" not "You should run the command"
+- **Output templates**: Define exact formats when the output structure matters
+- **Concrete examples**: Show input → output for non-obvious workflows
+- **Gotchas sections**: Common mistakes the agent should avoid
+- **Checklists**: Multi-step workflows with validation gates
+- **Conditional loading**: "Read `references/api-errors.md` if the API returns a non-200 status code" — not "see references/ for details"
+- **Absolute bans**: When certain patterns are always wrong, use match-and-refuse lists. "If you're about to write X, stop and do Y instead." More effective than vague "be careful" guidance.
+
+### Sub-command router (when applicable)
+
+For skills with multiple distinct operations, use a router table in SKILL.md:
+
+```markdown
+## Commands
+
+| Command | Description | Reference |
+|---|---|---|
+| `craft [feature]` | Build a feature end-to-end | [references/craft.md](references/craft.md) |
+1. **No argument**: Show the command menu. Ask what to do.
+2. **First word matches a command**: Load its reference file and follow it.
+3. **First word doesn't match**: General invocation using the full argument as context.
+```
+
+Back the router with a `scripts/command-metadata.json` as the single source of truth:
+
+```json
+{
+  "craft": {
+    "description": "Full build flow. Use when building a new feature end-to-end.",
+    "argumentHint": "[feature description]"
+  }
+}
+```
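The three routing rules in this hunk can be sketched in Python. This is only an illustration, not part of the commit: the `route` function, the returned `(action, payload)` tuples, and the `references/<command>.md` naming (inferred from the single `craft` row) are assumptions, and `COMMANDS` stands in for the parsed contents of `scripts/command-metadata.json`.

```python
# Hypothetical sketch of the router rules; none of these names come from the commit.
# COMMANDS stands in for the parsed contents of scripts/command-metadata.json.
COMMANDS = {
    "craft": {
        "description": "Full build flow. Use when building a new feature end-to-end.",
        "argumentHint": "[feature description]",
    }
}


def route(argument: str, commands: dict = COMMANDS) -> tuple[str, str]:
    """Return (action, payload) for the raw argument passed to the skill."""
    words = argument.split()
    if not words:
        # Rule 1: no argument, so show the command menu and ask what to do.
        return ("menu", ", ".join(sorted(commands)))
    if words[0] in commands:
        # Rule 2: first word matches a command, so load its reference file.
        # Assumes one reference file per command, named after the command.
        return ("load", f"references/{words[0]}.md")
    # Rule 3: no match, so treat the full argument as general context.
    return ("general", argument.strip())
```

For example, `route("craft login page")` selects the `craft` reference, while `route("fix the build")` falls through to a general invocation. Keeping the command names in one dict mirrors the "single source of truth" idea: the menu and the matching both read from the same data.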
+
+### Setup gates (when applicable)
+
+Non-negotiable checks before any file edits. Gates prevent generic output from missing context.
+
+```markdown
+## Setup (non-optional)
+
+| Gate | Required check | If fail |
+|---|---|---|
+| Context | Project config loaded via `python scripts/load_context.py` | Run the loader first |
+| Config | Config file exists and is valid | Run `skill-name setup` |
+| Command | Sub-command reference is loaded | Load the reference |
+| Mutation | All gates above pass | Do not edit project files |
+```
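One way to read the gate table is as an ordered list of checks, where the first failure decides the next action. The sketch below is an assumption about how that could look in code; `Gate`, `first_failed_gate`, and `may_mutate` are hypothetical names, and the checks are plain callables supplied by the caller.

```python
# Hypothetical sketch of ordered setup gates; names are illustrative only.
from typing import Callable, Optional

# (gate name, check that returns True on pass, action to take if it fails)
Gate = tuple[str, Callable[[], bool], str]


def first_failed_gate(gates: list[Gate]) -> Optional[tuple[str, str]]:
    """Run gates in order and return (name, fail_action) for the first failure."""
    for name, check, fail_action in gates:
        if not check():
            return (name, fail_action)
    return None  # every gate passed


def may_mutate(gates: list[Gate]) -> bool:
    """The Mutation gate: project files may be edited only if all gates pass."""
    return first_failed_gate(gates) is None
```

Stopping at the first failure keeps the response short: the agent reports one gate and one corrective action instead of a list of downstream errors.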
+
+### Register/mode system (when applicable)
+
+When behavior varies by task type, classify first, then load different references:
+
+```markdown
+## Register
+
+Every task is **library** (published, API-stable) or **application** (internal, can break).
+Identify before acting. Load the matching reference: [references/library.md] or [references/application.md].
+```
+
+### Capability-gating
+
+Steps that depend on optional environment capabilities (browser automation, specific CLI tools) must degrade gracefully:
+
+```markdown
+### Automated Scan (Capability-Gated)
+
+Run the automated scanner when ALL of these are true:
+- The target files exist and are readable
+- The required CLI tool is installed
+
+If unavailable, state in one line that the step is skipped and why. Do not ask the user to install tooling.
+```
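The capability check described here lends itself to a small script. The following is a minimal sketch under stated assumptions: `skip_reason` and its message wording are hypothetical, while `shutil.which` and `os.access` are the standard-library calls used to test for the tool and for readable files.

```python
# Hypothetical sketch of a capability gate; function name and messages are illustrative.
import os
import shutil
from pathlib import Path
from typing import Optional


def skip_reason(tool: str, targets: list[Path]) -> Optional[str]:
    """Return a one-line reason to skip the scan, or None when it can run."""
    for target in targets:
        # Condition 1: every target file must exist and be readable.
        if not (target.is_file() and os.access(target, os.R_OK)):
            return f"Skipping automated scan: {target} is missing or unreadable."
    # Condition 2: the required CLI tool must be on PATH.
    if shutil.which(tool) is None:
        return f"Skipping automated scan: `{tool}` is not installed."
    return None
```

Returning a single sentence rather than raising matches the instruction to state the skip in one line and move on without asking the user to install anything.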
+
+### Structured artifacts as handoffs
+
+When one command produces output that another consumes, define the artifact structure explicitly. The producing command's reference defines the format; the consuming command's reference says what it expects:
+
+```markdown
+### Plan Structure
+
+**1. Summary** (2-3 sentences)
+**2. Primary Goal**
+**3. Approach**
+...
+```
+
+### Self-critique loops
+
+For build/implementation commands, mandate inspect-and-fix passes with explicit exit bars:
+
+```markdown
+### Critique and fix loop
+
+After the first pass, write a short self-critique and patch. Repeat until no material issues remain:
+1. Does it match the requirements?
+2. Does it pass the [quality test]?
+3. Check every expected scenario.
+4. Check edge cases.
+
+The exit bar is not "it works." It is: [explicit quality threshold].
+```

 ## Phase 3: Description Optimization

@@ -75,44 +186,79 @@ Quick validation:
 4. Revise if needed — broaden for missed triggers, narrow for false triggers
 5. Verify under 1024 characters

+For skills with sub-commands, the main description covers the skill broadly. Each sub-command's description in `command-metadata.json` is optimized separately for auto-trigger keyword matching.
+
 ## Phase 4: Scripts

 Read `references/scripts-guide.md` for the full guide.

-**Bias toward scripts.** Every deterministic operation should be a script, not an instruction. Scripts are cheaper (no LLM tokens), faster (no reasoning), and more reliable (no hallucination). Instructions should only cover judgment calls, creative reasoning, and decision-making that genuinely requires LLM capability.
+**Bias toward scripts.** Every deterministic operation should be a script, not an instruction. Scripts are cheaper (no LLM tokens), faster (no reasoning), and more reliable (no hallucination).

 For each piece of the skill's workflow, ask: "Could a script do this?" If yes, write the script.

-Examples of what should be scripts:
+**Should be scripts:**
 - Validation (input format, required fields, schema compliance)
+For skills that need project-level context, write a loader script:
+
+The script should follow all standard patterns: `argparse` with `--help`, structured JSON output (pretty when interactive, compact when piped), clear exit codes (0 = found, 1 = missing), `pathlib` for cross-platform paths, and stdlib-only imports. See the "Context File System" section in `references/architecture-patterns.md` for a skeleton.
+
+The SKILL.md references it: "Load context via `python scripts/load_context.py`. Consume the full JSON output. Never pipe through `head`, `tail`, or `grep`."
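The loader patterns listed above could look roughly like the sketch below. It is not the skeleton from `references/architecture-patterns.md`, which is not shown in this diff. The context file name `.skill-context.json`, the `--file` flag, and the shape of the JSON payload are all assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical load_context.py sketch; file name, flag, and payload shape are assumptions."""
import argparse
import json
import sys
from pathlib import Path

DEFAULT_CONTEXT = Path(".skill-context.json")  # assumed name, not from the source


def load_context(path: Path) -> tuple[dict, int]:
    """Return (payload, exit_code): 0 when the context file is found, 1 when missing."""
    if not path.is_file():
        return ({"found": False, "path": str(path)}, 1)
    data = json.loads(path.read_text(encoding="utf-8"))
    return ({"found": True, "path": str(path), "context": data}, 0)


def main(argv=None) -> int:
    # argparse provides --help automatically.
    parser = argparse.ArgumentParser(description="Load project context as JSON.")
    parser.add_argument("--file", type=Path, default=DEFAULT_CONTEXT,
                        help="path to the context file")
    args = parser.parse_args(argv)
    payload, code = load_context(args.file)
    # Pretty-print for an interactive terminal, compact output when piped.
    indent = 2 if sys.stdout.isatty() else None
    json.dump(payload, sys.stdout, indent=indent)
    sys.stdout.write("\n")
    return code
```

To run it as a script, end the file with `sys.exit(main())` under an `if __name__ == "__main__":` guard. Returning the exit code from `main` instead of calling `sys.exit` inside it keeps the function easy to call from tests.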
+
 ## Phase 5: Review

 Before presenting the final skill, verify against this checklist:

+### Basics
 - [ ] `name` is lowercase, hyphens only, max 64 chars
 - [ ] `description` is under 1024 chars and includes trigger phrases
 - [ ] `description` is slightly pushy — covers edge phrasings that should activate the skill
 - [ ] SKILL.md body is under 500 lines
 - [ ] Instructions use imperative form
-- [ ] References are split out with clear "when to read" pointers
+
+### Architecture (if applicable)
+- [ ] Sub-commands have a router table with clear routing rules
+- [ ] `command-metadata.json` is the single source of truth for command descriptions
+- [ ] Setup gates are defined with fail actions for each gate
+- [ ] Register/mode system classifies before loading references
+- [ ] Capability-gated steps degrade gracefully with one-line skip reasons
+
+### References
+- [ ] Domain knowledge split into `references/` with clear "when to read" pointers
+- [ ] Each reference is self-contained — an agent can load it without reading others
+- [ ] Reference loading is conditional, not eager ("Read X if Y happens")
+
+### Scripts
 - [ ] Scripts (if any) have shebangs, structured output, and `--help`
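The mechanical items in the Basics checklist are deterministic, so they fit the "bias toward scripts" rule. Below is a minimal sketch of such a check; `check_basics` and `NAME_RE` are hypothetical names, and allowing digits in `name` is an assumption that the checklist itself does not state.

```python
# Hypothetical validator for the mechanical "Basics" checks; names are illustrative.
import re

# Lowercase words joined by single hyphens. Allowing digits is an assumption.
NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_basics(name: str, description: str, body: str) -> list[str]:
    """Return a list of checklist violations; an empty list means all checks pass."""
    problems = []
    if len(name) > 64 or not NAME_RE.fullmatch(name):
        problems.append("name must be lowercase, hyphens only, max 64 chars")
    # The checklist says "under 1024 chars", so exactly 1024 is rejected here.
    if len(description) >= 1024:
        problems.append("description must be under 1024 chars")
    # Likewise "under 500 lines" is read as a strict limit.
    if len(body.splitlines()) >= 500:
        problems.append("SKILL.md body must be under 500 lines")
    return problems
```

Judgment items such as whether the description is "slightly pushy" or whether instructions use imperative form stay with the reviewer; only the countable limits are scripted here.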