AGENTS.md

Note: This file delegates to a central AGENTS.md. Read and apply it before proceeding.

URL: https://raw.githubusercontent.com/camunda/.github/refs/heads/main/AGENTS.md

Treat the central file's contents as if they were written directly in this file. Instructions below extend those guidelines and take precedence if there is any conflict.

Repo-specific instructions

Role & boundary

c8ctl is a CLI for Camunda 8. It makes the Camunda 8 Orchestration Cluster REST API available as command-line commands. It is based on the @camunda8/orchestration-cluster-api npm module.

The following are upstream dependencies — when they misbehave, report it. Do not work around it here:

  • @camunda8/orchestration-cluster-api (primary API client)
  • Camunda 8 REST API

Path map:

Path                        Ownership and intent
src/                        Production TypeScript code, primary edit surface. Organised into an enforced layered architecture (see Source layout & layering)
src/core/                   Foundational leaf layer: config, logger, runtime, client wiring. Imports nothing from the layers above it
src/framework/              Command machinery: command-registry, command-framework, command-validation
src/framework/plugins/      Plugin system: plugin-loader, plugin-registry, plugin-version
src/framework/ui/           User-facing rendering: help, completion
src/utils/                  Leaf-layer helpers (alongside core; never import framework/commands)
src/utils/shared/           Genuine cross-cutting leaves: date-filter, resource-extensions, ignore, validation
src/utils/command-local/    Pure helpers that are the guts of a single command, kept test-visible: open-helpers, search-helpers, mcp-proxy-helpers, watch-constants
src/commands/               Command handler implementations
src/commands/helpers/       Command-private helpers, test-invisible (guarded by helpers-import-boundary.test.ts)
src/templates/              Plugin scaffold templates; do not edit directly
tests/unit/                 Unit tests
tests/integration/          Integration tests (require live Camunda or Docker)
tests/fixtures/             BPMN/DMN test fixtures
default-plugins/            Built-in embedded plugins (JavaScript or TypeScript)
plugins/                    GritQL lint and refactoring rules
assets/c8/rest-api/         OpenAPI backup reference; do not edit
.github/SDK_GAPS.md         SDK gap tracking; check before implementing SDK features

Entry points: src/index.ts

Architecture

COMMAND_REGISTRY   →  metadata (flags, resources, help, validation)
defineCommand()    →  handler (receives typed flags + positionals)
COMMAND_DISPATCH   →  wiring (maps "verb:resource" to handler)

Key components:

  • src/framework/command-registry.ts — single source of truth: all commands are declared here with flags, resources, help text, validation, and shell completions
  • src/command-dispatch.ts — maps "verb:resource" keys to handler functions (composition root, top-level)
  • src/framework/command-framework.ts — provides defineCommand(), the CommandContext type, and the createDryRun() factory behind ctx.dryRun()
  • src/index.ts — CLI entry point; parses arguments, resolves profiles, routes to handlers (composition root, top-level)
  • src/core/config.ts — profile and session state: stores credentials, active profile, active tenant, output mode
  • src/core/logger.ts — text/JSON output rendering; isRecord() type guard lives here
  • src/core/runtime.ts — global runtime singleton (c8ctl) exposed to plugins; getUserDataDir() resolution
  • src/commands/ — per-resource command handler files
  • default-plugins/ — built-in embedded plugins (JavaScript or TypeScript; .ts files are typechecked and linted alongside src/)

Source layout & layering

src/ follows a strict, acyclic, enforced layered architecture (introduced in #414, sub-grouped in #427/#428). A file may only import from layers at or below its own:

composition root (src/index.ts, src/command-dispatch.ts)  →  commands  →  framework  →  core
                                                                          utils (leaf)

Layer                                                       May import from
core/                                                       core
utils/                                                      core, utils (leaf: never framework/commands)
framework/                                                  core, utils, framework (never commands)
commands/                                                   core, utils, framework, commands
composition root (src/index.ts, src/command-dispatch.ts)    anything

This is guarded by tests/unit/layering-import-boundary.test.ts. The guard classifies a file by its first path segment, so sub-directories (framework/plugins/, framework/ui/, utils/shared/, utils/command-local/) inherit their parent layer — sub-grouping is free w.r.t. the guard. The composition root is pinned to an explicit allow-list (src/index.ts, src/command-dispatch.ts); any new top-level src/*.ts file fails the guard until it is deliberately classified.
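
The direction rule can be pictured as a small lookup. The following is a minimal sketch under stated assumptions, not the code in tests/unit/layering-import-boundary.test.ts: the names `layerOf`, `mayImport`, and `ALLOWED` are hypothetical, and paths are assumed to be given relative to src/.

```typescript
// Hypothetical sketch of the layer-direction rule; the real guard may differ.
type Layer = "core" | "utils" | "framework" | "commands" | "root";

// Composition root allow-list, as named in the text above.
const COMPOSITION_ROOT: readonly string[] = ["index.ts", "command-dispatch.ts"];

// Which layers each layer may import from (mirrors the table above).
const ALLOWED: Record<Layer, readonly Layer[]> = {
  core: ["core"],
  utils: ["core", "utils"],
  framework: ["core", "utils", "framework"],
  commands: ["core", "utils", "framework", "commands"],
  root: ["core", "utils", "framework", "commands", "root"],
};

function isLayerDir(segment: string): segment is Exclude<Layer, "root"> {
  return ["core", "utils", "framework", "commands"].includes(segment);
}

// Classify a path (relative to src/) by its first segment only, so
// framework/plugins/x.ts and framework/ui/y.ts are both "framework".
function layerOf(relPath: string): Layer | undefined {
  const [first = ""] = relPath.split("/");
  if (isLayerDir(first)) return first;
  return COMPOSITION_ROOT.includes(relPath) ? "root" : undefined;
}

function mayImport(fromPath: string, toPath: string): boolean {
  const from = layerOf(fromPath);
  const to = layerOf(toPath);
  // Unclassified files (for example a new top-level src/foo.ts) fail the check.
  if (from === undefined || to === undefined) return false;
  return ALLOWED[from].includes(to);
}
```

Under these assumptions, `mayImport("framework/ui/help.ts", "commands/deploy.ts")` is false, while the same target is allowed from `index.ts`.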

Per-layer barrels (module encapsulation, #424)

The three barreled layers — core, utils, framework — each expose a single public entry point, <layer>/index.ts (export * over their internal files). The same guard enforces two further rules:

  • Rule A — cross-layer imports go through the barrel. A file importing into a barreled layer must target ../core/index.ts, never a deep file like ../core/logger.ts. This holds for commands/** and the composition root alike.
  • Rule B — intra-layer imports stay direct. A file within a barreled layer imports its siblings by direct path (./logger.ts), never via its own barrel — this keeps module-eval order obvious and avoids self-referential cycles. (The barrel file re-exporting siblings via ./file.ts is itself an intra-layer direct import, so it is allowed.)

commands/ is deliberately not barreled: nothing imports it cross-layer except the composition root, which is allowed to reach deep command files (it eager-loads every handler in command-dispatch.ts). To widen a barreled layer's public surface, add the symbol's source file to that layer's index.ts.
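
Rule A and Rule B can be read as one predicate over an (importer, target) pair. This is an illustrative sketch only; `barrelViolation` and `BARRELED_LAYERS` are hypothetical names, and paths are assumed to be relative to src/.

```typescript
// Hypothetical sketch of the two barrel rules; not the real guard's code.
const BARRELED_LAYERS: readonly string[] = ["core", "utils", "framework"];

// Paths are relative to src/, e.g. "commands/deploy.ts" or "core/logger.ts".
// Returns a description of the violated rule, or null if the import is fine.
function barrelViolation(importer: string, target: string): string | null {
  const [importerLayer = ""] = importer.split("/");
  const [targetLayer = ""] = target.split("/");

  // commands/ is not barreled, so deep imports into it are not checked here.
  if (!BARRELED_LAYERS.includes(targetLayer)) return null;

  const targetsBarrel = target === `${targetLayer}/index.ts`;
  const sameLayer = importerLayer === targetLayer;

  // Rule A: cross-layer imports must go through <layer>/index.ts.
  if (!sameLayer && !targetsBarrel) return "Rule A: import via the layer barrel";
  // Rule B: intra-layer imports must use the sibling's direct path.
  if (sameLayer && targetsBarrel) return "Rule B: import the sibling file directly";
  return null;
}
```

Note that the barrel file itself (`core/index.ts` importing `core/logger.ts`) passes, because it is an intra-layer direct import.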

Two conventions worth internalising:

  • utils/shared/ vs utils/command-local/ — the split axis is "command-agnostic reusable leaf" vs "the guts of one command". command-local helpers stay test-visible (not moved into commands/**) deliberately: their logic is pure and cross-platform and is better covered by direct unit tests than by coarser c8() subprocess tests.
  • utils/command-local/ vs commands/helpers/ — both hold command-specific support code, but command-local is test-visible (tests import it directly) while commands/helpers/ is command-private and test-invisible (driven only through the c8() subprocess helper, per #291).

Per-invocation flags come from ctx, not the global runtime (#424)

The c8ctl runtime singleton (src/core/runtime.ts) stays — it is the irreducible plugin SDK exposed on globalThis.c8ctl (every default plugin and the scaffold template read it via globalThis.c8ctl), and src/core/** legitimately owns and wraps it (the core error handler and SDK client read c8ctl.verbose/c8ctl.dryRun directly; the composition root writes them). What shrank is its mutable surface as seen from the layers above core.

Per-invocation request state — whether --dry-run / --verbose were passed — is resolved once at the composition root (src/index.ts) and threaded into handlers through the typed CommandContext:

  • ctx.dryRun({ command, method, endpoint, profile, body? }) — the bound dry-run helper. Returns a DryRunResult when dry-run is active (return it to short-circuit), else null. Built by createDryRun(isDryRun) at the composition root. This replaces the old free dryRun() helper that read the global.
  • ctx.isDryRun — the raw boolean, for the handful of handlers that emit a custom dry-run payload (deploy, identity) rather than the framework's standard DryRunResult.
  • ctx.verbose — whether --verbose was passed.

src/commands/** and src/framework/** must obtain these from ctx (handlers) or an explicit parameter (non-handler framework entry points such as installCompletion/refreshCompletionsIfStale), and must not read c8ctl.dryRun or c8ctl.verbose back off the global. This is guarded by tests/unit/no-runtime-global-reads-in-handlers.test.ts (AST-based, so the field names are safe to mention in comments and strings). c8ctl.outputMode is session display state, not a per-invocation request flag — read it via logger.mode where a logger is in scope; the session command reads it directly because managing session state is its job.
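
To make the "captured once, then threaded" idea concrete, here is a minimal sketch of a bound dry-run helper. It is not the actual createDryRun() from src/framework/command-framework.ts: the function name `createDryRunSketch`, the `baseUrl` field on the profile, and the way the URL is assembled are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types live in src/framework/command-framework.ts.
interface DryRunRequest {
  command: string;
  method: string;
  endpoint: string;
  profile: { baseUrl: string }; // assumed field, for illustration only
  body?: unknown;
}

interface DryRunResult {
  kind: "dry-run";
  command: string;
  method: string;
  url: string;
  body?: unknown;
}

// The flag is captured once when the helper is built; the returned
// function never reads any global state.
function createDryRunSketch(isDryRun: boolean) {
  return (req: DryRunRequest): DryRunResult | null => {
    if (!isDryRun) return null; // not a dry run: the handler continues
    return {
      kind: "dry-run",
      command: req.command,
      method: req.method,
      url: `${req.profile.baseUrl}${req.endpoint}`, // assumed URL layout
      body: req.body,
    };
  };
}
```

A handler would then write `const dr = dryRun({...}); if (dr) return dr;` before any I/O, exactly as in the canonical shape.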

Commit message guidelines

We use Conventional Commits.

Format:

<type>(optional scope): <subject>

<body>

BREAKING CHANGE: <explanation>

Allowed type values (common set):

feat
fix
chore
docs
style
refactor
test
ci
build
perf

Rules:

  • Subject length: 5–100 characters (commitlint enforces subject-min-length & subject-max-length).
  • Use imperative mood ("add support", not "added support").
  • Lowercase subject (except proper nouns). No PascalCase subjects (rule enforced).
  • Keep subject concise; body can include details, rationale, links.
  • Prefix breaking changes with BREAKING CHANGE: either in body or footer.
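
As a rough illustration of the header format and the length rule, the sketch below checks a commit header in plain TypeScript. It is not the project's commitlint configuration; `checkHeader` and its messages are hypothetical, and it deliberately ignores the casing rule, which cannot tell proper nouns apart.

```typescript
// Allowed types, copied from the list above.
const TYPES: readonly string[] = [
  "feat", "fix", "chore", "docs", "style",
  "refactor", "test", "ci", "build", "perf",
];

// <type>(optional scope): <subject>
const HEADER = /^([a-z]+)(?:\(([^)]+)\))?: (.+)$/;

// Returns a list of problems; an empty list means the header passed.
function checkHeader(header: string): string[] {
  const match = HEADER.exec(header);
  if (match === null) {
    return ["header must look like <type>(optional scope): <subject>"];
  }
  // match[1] is the type, match[2] the optional scope, match[3] the subject.
  const [, type = "", , subject = ""] = match;
  const problems: string[] = [];
  if (!TYPES.includes(type)) problems.push(`unknown type "${type}"`);
  if (subject.length < 5 || subject.length > 100) {
    problems.push("subject must be 5-100 characters");
  }
  return problems;
}
```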

Review-comment fix-ups

Commits that address PR review comments must use the chore type (e.g. chore: or chore(<scope>):), not the fix type. fix commits (e.g. fix: or fix(<scope>):) trigger a patch release and a CHANGELOG entry — review iterations are not user-facing bug fixes.

# Correct
chore: address review comments — use logger.json for dry-run

# Wrong — will pollute the CHANGELOG
fix: address review comments — use logger.json for dry-run

Examples:

feat(worker): add job worker concurrency gating
fix(retry): prevent double backoff application
chore(ci): stabilize deterministic publish (skip spec fetch)
chore: address review comments — NUL-safe pre-commit hook
docs: document deterministic build flag
refactor(auth): simplify token refresh jitter logic

Build pipeline

Always-green policy

Before every AI-assisted session, verify CI is green:

npm test

Warnings are fatal. Do not suppress a warning to make a build pass. Do not treat any failure as pre-existing or unrelated without explicit confirmation from the engineer.

# Verify baseline -> always green (always run before an AI-assisted session)
npm test

# Fast inner loop (unit tests only) to iterate quickly
npm run test:unit

# Full pipeline before committing the change
npm run build && npm test

Never skip the lint and type-check steps before pushing.

  • only use Node.js 22 LTS and respect .nvmrc
  • this is a native Node.js project running TS files
  • there is no build step for development. Only compile for test purposes or release.
  • run npm run build before npm test — this enables the full test suite and prevents build-dependent tests from being skipped. It also catches compilation and type errors early.
  • on changes, make sure all tests pass and a build via npm run build works without errors

Local checks

  • npm run typecheck — runs tsc --noEmit -p tsconfig.check.json over src/ and tests/
  • npx biome check --fix — lints and formats src/ and tests/ per biome.json (includes the no-unsafe-type-assertion plugin)
  • npm run check:layering — runs the architectural layering guard (tests/unit/layering-import-boundary.test.ts) in isolation: layer-direction rules plus the per-layer barrel rules (#414/#424). Fast (~1s); use it to check import boundaries without running the whole unit suite.
  • npm run test:unit — fast unit tests (no live Camunda required)
  • .githooks/pre-commit — on commit, runs biome on staged files and typechecks a temporary tsconfig scoped to the staged set (transitive imports are still resolved). When any src/ file is staged it also runs the layering guard (check:layering) against the working tree, so import-boundary regressions are caught at commit time instead of in CI. Skips biome, tsc, or the guard individually if the toolchain is not installed locally.

Test process isolation — --experimental-test-isolation=none for integration tests

tests/unit/*.test.ts runs with the default per-file process isolation of node:test. tests/integration/*.test.ts runs with --experimental-test-isolation=none (single process for all integration files).

This is not a performance choice — it is a correctness choice. Do not remove the flag from test:integration without reading #312 and #182 first.

Background: per-file isolation spawns one subprocess per test file and structure-clones results back to the parent. This trips nodejs/node#56802 intermittently, surfacing as Error: Unable to deserialize cloned data due to invalid or unsupported version. The defect is in the IPC channel itself, so reducing parallelism (--test-concurrency=1, serialising files) does not fix it — only removing the IPC channel does.

The flag was originally added in #189, removed in 2bae796 (PR #282) on the assumption that Node 24.12.0 had fixed the underlying bug, and reinstated for the integration suite after the failure recurred. The Node bug is still open — verify with the upstream issue before removing the flag again.

The flag is not applied to test:unit because the unit suite has 66 files, and per-file isolation finishes in about 22s versus 10m+ in a single process. The IPC bug can fire there too in principle, but the unit suite has been observed to be stable in practice and the performance difference is too large to give up. If test:unit ever starts hitting the same error, apply the same flag to it as well.

Implementation details

  • always make sure that CLI commands, resources and options are reflected in

  • for every implementation, make sure to add or update tests that cover the new functionality. This includes unit tests for individual functions and integration tests for end-to-end scenarios. Tests should be comprehensive and cover edge cases to ensure the robustness of the codebase.

  • in any test, only use the implemented CLI commands to interact with the system. Avoid using internal functions or direct API calls in tests, as this can lead to brittle tests that are tightly coupled to the implementation. By using the CLI commands, you ensure that your tests are more resilient to changes in the underlying code and better reflect real-world usage.

  • don't use Promises in tests to wait for the overall system status to settle. Instead, use the polling helper from tests/utils/polling.ts to wait for specific conditions to be met.
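
The signature of the helper in tests/utils/polling.ts is not shown here, so the following is only a generic sketch of the polling technique. `pollUntil`, `PollOptions`, and `sleep` are hypothetical names; check the real helper before relying on any of them.

```typescript
// Hypothetical polling helper; the real tests/utils/polling.ts may differ.
interface PollOptions {
  timeoutMs: number; // overall safety net, not a correctness signal
  intervalMs: number; // delay between attempts
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Re-evaluates `condition` until it returns true or the timeout elapses.
// Resolves to true if the condition was met, false on timeout.
async function pollUntil(
  condition: () => boolean | Promise<boolean>,
  { timeoutMs, intervalMs }: PollOptions,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await condition()) return true;
    // Stop if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await sleep(intervalMs);
  }
}
```

The point of the technique is that the test waits for a specific, observable condition instead of sleeping for a fixed time and hoping the system has settled.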

Help output: flag scoping rule

c8ctl --help (top-level) lists only the flags in GLOBAL_FLAGS (src/framework/command-registry.ts). Verb- and resource-specific flags appear under c8ctl help <verb> only.

There is no opt-in mechanism for promoting a verb-specific flag into the top-level Flags section. The previous FlagDef.showInTopLevelHelp field and the (use with 'verb resource') parenthetical workaround were removed in #321 / #322 because they produced misleading output: a flag listed at the root looks like it applies to every command.

If you find yourself wanting to promote a flag into the top-level Flags section:

  • If the flag genuinely applies to every (or nearly every) command — add it to GLOBAL_FLAGS. It is now global.
  • If the flag is verb-specific but you want it discoverable — keep it verb-specific. If an example would help, add an entry to the verb's helpExamples in COMMAND_REGISTRY; those examples are aggregated into the top-level c8ctl --help output under the global Examples: section, not into c8ctl help <verb>.
  • Do not reintroduce a per-flag opt-in field on FlagDef. Class-scoped guards in tests/unit/help.test.ts (Top-level help is scoped to global flags (#321)) compare the rendered top-level Flags section against Object.keys(GLOBAL_FLAGS) and will fail if anything else leaks in.

Work environment

  • when you are not in "Cloud" mode, make sure to evaluate the OS environment and adapt behavior accordingly

  • prefer cross-platform solutions where reasonable

  • always consult the GitHub repository camunda/orchestration-cluster-api-js for API details and usage examples. It is the main source of truth for how to interact with the Camunda 8 Orchestration Cluster API. As a backup, a copy of the REST API documentation is available in OpenAPI format in the assets/c8/rest-api folder. As a last resort, you should refer to the npm module https://www.npmjs.com/package/@camunda8/orchestration-cluster-api.

  • always consult .github/SDK_GAPS.md for known SDK limitations before implementing features that interact with the Camunda SDK. When a newer SDK version is available, check whether gaps listed there have been resolved and update the file accordingly (mark resolved items, remove workarounds).

  • consult CONTEXT.md for CLI structure, resource aliases, and agent flags

  • consult EXAMPLES.md for command usage patterns

  • consult PLUGIN-HELP.md when working on the plugin system

TypeScript conventions

  • use modern TypeScript syntax and features
  • never use any — use unknown and narrow with type guards. Enforced by Biome (noExplicitAny, noImplicitAnyLet, noEvolvingTypes — all set to error)
  • never use as T type assertions — use type guards, narrowing, or satisfies instead. Enforced by a GritQL plugin (plugins/no-unsafe-type-assertion.grit) that applies to both src/ and tests/. Exceptions: as const and import renames are allowed. If a cast is genuinely unavoidable, add a // biome-ignore lint/plugin: comment with a justification and a tracking issue reference
  • run npx biome check to verify — biome.json scopes this to src/ and tests/. This runs as part of npm run build, CI, and the pre-commit hook (on staged files). Zero diagnostics required
  • run npx biome check --fix before committing to auto-fix formatting and lint issues
  • use modern Getter and Setter syntax for class properties. Examples:
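class MyClass {
  // Initialised so the class compiles under strictPropertyInitialization.
  private _myProp = "";

  // Read access: instance.myProp
  get myProp(): string {
    return this._myProp;
  }

  // Write access: instance.myProp = "value"
  set myProp(value: string) {
    this._myProp = value;
  }
}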
  • prefer object-style function parameters over long parameter lists to be most flexible with parameter order. Example:
function createUser({ name, email, age }: { name: string; email: string; age: number }) {
  // function body
}
  • use logger.ts:isRecord(value) to narrow unknown to Record<string, unknown> (no as casts)
  • prefer functional programming over OOP where reasonable
  • prefer concise expressions over verbose control structures
  • when outputting errors, provide clear, concise and actionable hints to the user
  • pay attention to cross-platform compatibility (Linux, macOS, Windows). On Windows, support only WSL; native Windows is not supported.
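
The "unknown plus type guard" convention can be sketched as follows. The guard below is a common way to write such a predicate and is assumed, not confirmed, to resemble isRecord() in src/core/logger.ts; `readProcessKey` and the `processDefinitionKey` field are hypothetical and exist only to show the narrowing.

```typescript
// Sketch of a type guard like isRecord(); the one in src/core/logger.ts may differ.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Narrow an unknown payload step by step instead of casting with `as`.
function readProcessKey(payload: unknown): string | undefined {
  if (!isRecord(payload)) return undefined;
  const key = payload["processDefinitionKey"]; // hypothetical field name
  return typeof key === "string" ? key : undefined;
}
```

Each step narrows the type through a runtime check, so the compiler never has to trust an unchecked assertion.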

BPMN and DMN validation

  • when creating or modifying .bpmn files, validate them with bpmnlint by running npx bpmnlint <file> before considering the task complete
  • when creating or modifying .dmn files, validate them with dmnlint by running npx dmnlint <file> before considering the task complete
  • fix any reported errors before proceeding

Terminal commands

  • when running terminal commands through an AI agent or other automation tool, avoid heredocs (<< EOF) because they don't work reliably in zsh on macOS
  • when using an AI agent or automation tooling, prefer its native file-editing capabilities for creating or modifying files
  • for appending single lines from the shell in those workflows, echo or printf is fine: echo "content" >> file.txt

Refactoring discipline

  • red/green refactor for new behaviour and bug fixes — write the failing test first, then the minimal production change that makes it pass. The test serves two roles simultaneously: it encodes the acceptance criteria for the change, and it becomes a permanent regression guard. Writing the test first proves it can actually detect the defect or the missing behaviour; if a test passes before the production change lands, it isn't guarding anything. For bug fixes, scope the test to the defect class, not just the instance, so the same category of bug can't recur in a sibling code path
  • behaviour tests are the regression guard — during behaviour-preserving refactors, do not modify behaviour tests. If a test fails, the production code is usually wrong, not the test. If a change intentionally modifies observable behaviour (for example CLI output, help text, or exit codes), update the affected behaviour tests and explicitly document and justify the intended behaviour change in the PR
  • between refactors, always run npm run typecheck (tsc --noEmit -p tsconfig.check.json, covering src/ and tests/), npx biome check, and npm run test:unit to verify correctness

There are no flaky tests

We do not acknowledge the existence of "flaky tests". A test that passes sometimes and fails other times is reporting one of two things:

  1. A test defect — the test contains a race, an unbounded timeout, an order-of-operation assumption, an unsynchronised readiness signal, or a dependency on wall-clock timing. Fix the test so its outcome is deterministic for the behaviour it claims to assert.
  2. A product defect — the production code has a race, a missed signal, an unhandled error path, or a resource it leaks under load. Fix the product.

Either way, an intermittent failure is a real defect that must be diagnosed and fixed before the change merges. Do not retry the CI job, mark the test skip, add a .retry(), or describe the failure as "flaky" or "unrelated" in the PR description. "Re-run and hope" is a coping strategy, not engineering.

When triaging an intermittent CI failure:

  • Reproduce locally if possible (loops, resource pressure, timeout reduction). If you cannot reproduce, reason from first principles about what could differ between local and CI (load, filesystem semantics, signal delivery latency, parallel test interaction).
  • Identify the specific race or assumption. Common shapes: polling for an output line that is printed before the relevant handler is registered; timeouts that double as correctness assertions; tests that share a temp directory across runs; tests that depend on event ordering across two processes.
  • Pick category 1 vs category 2 explicitly in the fix commit message, and explain which signal the test was previously relying on and which deterministic signal it now relies on.
  • If timeouts must be generous to absorb runner load, the timeout is a safety net — not a correctness signal. State this in a comment so future maintainers don't tighten it back into a race.

Coverage analysis before a behaviour-preserving refactor

Before starting any non-trivial refactor, audit whether the surface you are about to change is sufficiently guarded. A passing test suite is necessary but not sufficient — it only proves that what is currently tested still works. The risk of a refactor is the behaviour that nobody asserts.

Produce a short coverage table in the planning step that maps each behaviour you intend to preserve to the test that locks it in. For each row, ask:

  • Does an existing test fail if this behaviour changes? If not, the behaviour is unguarded.
  • Is the test scoped to the defect class (e.g. "all long-running handlers exit 0 on SIGINT") or only to one instance? Class-scoped guards are durable; instance-scoped guards rot.
  • For lifecycle / signal / process-exit behaviours, does any test actually exercise the signal? child.kill('SIGTERM') does not exercise a SIGINT handler.

For every gap, write the missing guard test first, on the pre-refactor branch, and prove it passes against the current implementation. This is the green/green discipline:

  1. Green on the pre-refactor code — proves the test encodes preserved behaviour, not aspirational behaviour.
  2. Green on the refactored code — proves the refactor preserved it.

Land the guard tests in a separate PR off main, and merge that PR to main before the refactor PR merges. A guard test that lands together with the change it is supposed to guard is weaker — there is no recorded moment at which it passed against the old code, so reviewers cannot tell whether it would have caught a regression.

If you find that the surface is genuinely unguardable without a major investment (for example, full end-to-end tests of mcp-proxy against a remote MCP server), record that gap in the PR description and shrink the refactor scope rather than proceeding without a net.

Command handler shape

Every command handler in src/commands/** follows the same shape. Pattern-matching on a nearby file is reliable only when every nearby file follows the canonical shape, so this section pins the invariants and lists the enforcement lints that protect them.

Canonical shape

import { defineCommand } from "../framework/index.ts";

export const myCommand = defineCommand("myverb", "my-resource", async (ctx, flags, args) => {
  const { client, profile, logger } = ctx;

  // 1. Dry-run check — must come BEFORE any I/O. Returns a DryRunResult
  //    if `--dry-run` was passed, or `null` to continue. The helper is
  //    bound to this invocation's flag and lives on `ctx`, so handlers
  //    never read dry-run state off the global runtime.
  const dr = ctx.dryRun({
    command: "myverb my-resource",
    method: "POST",
    endpoint: "/my-resources",
    profile,
  });
  if (dr) return dr;

  // 2. Validate inputs — throw typed errors. Never `process.exit(1)`.
  if (!args.key) throw new Error("Missing required argument: key");

  // 3. Optional intermediate progress for multi-step operations.
  logger.info(`Doing thing for ${args.key}…`);

  // 4. Do the work.
  const result = await client.doSomething({ key: args.key });

  // 5. Return a CommandResult — the framework renders it (text, JSON,
  //    field filtering). For commands that flow through framework
  //    rendering, prefer the typed kinds (`list`, `get`, `success`, …)
  //    over inline `logger.success` / `logger.json`. Side-effectful
  //    commands that handle their own output (e.g. `deploy`, `run`,
  //    `open` — multi-step progress + final summary) may return
  //    `{ kind: "none" }` and emit directly via `logger`. Long-running
  //    handlers return `{ kind: "never" }` (see below).
  return { kind: "get", data: result };
});

Long-running handlers (watch, mcp-proxy) return { kind: "never" } and resolve a lifecycle promise on SIGINT instead of returning a payload. They still throw on validation errors and let the framework own the exit code.

Before / after

// ❌ Legacy shape — bypasses framework error rendering and dry-run helper.
export async function deploy(paths: string[], options: DeployOptions) {
  if (options.dryRun) {
    emitDryRun({ command: "deploy", method: "POST", endpoint: "/deployments", profile: options.profile });
    return;
  }
  if (paths.length === 0) {
    logger.error("At least one path is required");
    process.exit(1); // ⛔ skips finally, breaks --verbose stack traces
  }
  // …
}

export const deployCommand = defineCommand("deploy", "", async (ctx, flags) => {
  await deploy(ctx.positionals, { profile: ctx.profile, dryRun: flags.dryRun });
});

// ✅ Canonical shape: body inline, ctx.dryRun() helper, throws on failure,
//    returns a CommandResult.
export const deployCommand = defineCommand("deploy", "", async (ctx, flags) => {
  const dr = ctx.dryRun({ command: "deploy", method: "POST", endpoint: "/deployments", profile: ctx.profile });
  if (dr) return dr;

  if (ctx.positionals.length === 0) {
    throw new Error("At least one path is required");
  }
  const summary = await deployResources(ctx.positionals, { profile: ctx.profile });
  return { kind: "success", message: `Deployed ${summary.count} resources` };
});

Enforcement (don't drift back)

The shape is enforced by three lints/tests. Some of these are scaffolded under follow-up PRs (#334, #336) — once those land alongside this doc, all references below resolve to files on main.

  1. No process.exit under src/commands/** — issue #289. Stable refs:
  2. All COMMAND_DISPATCH entries come from defineCommand() — issue #290 (closed). tests/unit/command-dispatch-structure.test.ts walks COMMAND_DISPATCH and rejects any entry that is not the marked output of defineCommand (via DEFINE_COMMAND_MARKER).
  3. Tests don't import handlers from src/commands/** (except type-only) — issue #291. Staged guard shipped in PR #336 at tests/unit/test-import-boundary.test.ts — uses a closed PENDING_MIGRATION allow-list (current violators) that can only shrink. Tests must drive commands via the c8() subprocess helper instead of importing handler internals.

When { kind: "never" } applies

Long-running handlers — currently watch and mcp-proxy — correctly return { kind: "never" }. The handler resolves a lifecycle promise on SIGINT after draining the event loop (closing watchers / sockets, aborting in-flight requests via AbortController, clearing pending timers) and the framework returns naturally with exit code 0. Do not call process.exit() from the SIGINT handler — let the event loop drain.
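
A lifecycle promise of this kind might look like the sketch below. It is an assumption-laden illustration, not the code in watch or mcp-proxy: `untilInterrupted`, `SignalSource`, and `Cleanup` are hypothetical names, and `SignalSource` is a minimal stand-in for the part of Node's `process` object that registers signal listeners.

```typescript
// Hypothetical sketch; `SignalSource` stands in for Node's `process`.
interface SignalSource {
  once(event: "SIGINT", listener: () => void): unknown;
}

type Cleanup = () => void | Promise<void>;

// Resolves after SIGINT, once every cleanup has finished. Nothing here
// calls process.exit(): the handler awaits this promise, returns
// { kind: "never" }, and the process ends when the event loop is empty.
function untilInterrupted(
  source: SignalSource,
  cleanups: readonly Cleanup[],
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    source.once("SIGINT", () => {
      // The async wrapper turns synchronous throws into rejections.
      Promise.all(cleanups.map(async (cleanup) => cleanup())).then(
        () => resolve(),
        reject,
      );
    });
  });
}
```

In a handler, the cleanups would close watchers and sockets, call `abort()` on an AbortController for in-flight requests, and clear pending timers. Injecting the signal source also lets a unit test trigger the shutdown path without sending a real signal, although a guard for the actual SIGINT behaviour still has to send SIGINT to a subprocess.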

mcp-proxy additionally catches its own startup/shutdown errors and sets process.exitCode = 1 (rather than throw) because STDIO is the MCP protocol channel and re-entering the framework's stdout-emitting error path would corrupt the stream. This deviation is documented in src/commands/mcp-proxy.ts and pinned by tests/unit/mcp-proxy-error-paths.test.ts.

Adding a new command

Commands are defined declaratively. The COMMAND_REGISTRY in src/framework/command-registry.ts is the single source of truth — help text, shell completions, parseArgs options, and validation are all derived from it. No metadata is duplicated anywhere.

Architecture overview

COMMAND_REGISTRY   →  metadata (flags, resources, help, validation)
defineCommand()    →  handler (receives typed flags + positionals)
COMMAND_DISPATCH   →  wiring (maps "verb:resource" to handler)

Step-by-step

1. Declare the command in COMMAND_REGISTRY

Add or extend a verb entry in src/framework/command-registry.ts:

// In COMMAND_REGISTRY:
myverb: {
  description: "Short description shown in help",
  helpDescription: "Longer description for `c8ctl help` (optional, falls back to description)",
  mutating: true,              // true = write operation, false = read-only
  requiresResource: true,      // true = `c8ctl myverb <resource>`, false = `c8ctl myverb`
  resources: ["my-resource"],  // canonical resource names this verb accepts
  flags: {
    ...SEARCH_FLAGS,           // spread shared flag sets
    myFlag: {
      type: "string",
      description: "A custom flag",
      short: "m",              // optional single-letter alias
    },
  },
  // Optional: per-resource flag overrides
  resourceFlags: {
    "my-resource": MY_RESOURCE_SEARCH_FLAGS,
  },
  // Optional: positional arguments
  resourcePositionals: {
    "my-resource": MY_RESOURCE_POSITIONALS,
  },
  // Optional: help metadata
  hasDetailedHelp: true,
  helpFooterLabel: "Show myverb usage",
  helpExamples: [
    { command: "c8ctl myverb my-resource", description: "Do the thing" },
  ],
},

2. Define flag sets with as const satisfies

Flag sets must use as const satisfies to preserve concrete validator return types for InferFlags:

const MY_RESOURCE_SEARCH_FLAGS = {
  myKey: {
    type: "string",
    description: "Filter by key",
    validate: MyBrandedKey.assumeExists,  // branded type validator
  },
} as const satisfies Record<string, FlagDef>;

The validate function narrows the handler parameter type automatically — if it returns MyBrandedKey, the handler receives MyBrandedKey | undefined (no cast needed).

Positional arguments work the same way:

const MY_RESOURCE_POSITIONALS = [
  { name: "key", required: true, validate: MyBrandedKey.assumeExists },
] as const satisfies readonly PositionalDef[];
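
To see why `as const satisfies` matters, the sketch below uses simplified, hypothetical stand-ins for FlagDef and InferFlags (the real definitions are not shown in this document). `toMyBrandedKey` is likewise an invented validator that plays the role of `MyBrandedKey.assumeExists`.

```typescript
// Hypothetical, simplified stand-ins for FlagDef and InferFlags.
interface FlagDef {
  type: "string" | "boolean";
  description: string;
  validate?: (raw: string) => unknown;
}

// A flag with a validator takes the validator's return type.
type InferFlag<F extends FlagDef> = F extends { validate: (raw: string) => infer R }
  ? R | undefined
  : F["type"] extends "boolean"
    ? boolean | undefined
    : string | undefined;

type InferFlags<T extends Record<string, FlagDef>> = {
  [K in keyof T]: InferFlag<T[K]>;
};

// A branded key plus a validator that narrows with a type predicate (no `as T`).
type MyBrandedKey = string & { readonly __brand: "MyBrandedKey" };
const isMyBrandedKey = (raw: string): raw is MyBrandedKey => /^\d+$/.test(raw);
function toMyBrandedKey(raw: string): MyBrandedKey {
  if (!isMyBrandedKey(raw)) throw new Error(`Invalid key: ${raw}`);
  return raw;
}

// `satisfies` checks the shape but keeps the concrete validator type.
// Annotating with `: Record<string, FlagDef>` instead would widen
// `validate` to (raw: string) => unknown and lose the brand.
const FLAGS = {
  myKey: { type: "string", description: "Filter by key", validate: toMyBrandedKey },
  verbose: { type: "boolean", description: "More output" },
} as const satisfies Record<string, FlagDef>;

// myKey: MyBrandedKey | undefined, verbose: boolean | undefined
type Flags = InferFlags<typeof FLAGS>;

// Builds the typed flags from raw strings without a single cast.
function parseFlags(raw: Record<string, string | undefined>): Flags {
  const myKey = raw["myKey"];
  const verbose = raw["verbose"];
  return {
    myKey: myKey === undefined ? undefined : FLAGS.myKey.validate(myKey),
    verbose: verbose === undefined ? undefined : verbose === "true",
  };
}
```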

3. Add resource aliases (if new resource)

If introducing a new resource, add aliases in RESOURCE_ALIASES:

export const RESOURCE_ALIASES: Record<string, string> = {
  // ... existing aliases
  mr: "my-resource",
  "my-resources": "my-resource",  // plural form
};
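
Alias resolution itself is a plain lookup with a fallback. The sketch below uses a trimmed alias table and a hypothetical `resolveResource` function; the real resolver in the codebase may be named and structured differently.

```typescript
// Sketch only: a trimmed alias table and a hypothetical resolver.
const ALIASES: Record<string, string> = {
  mr: "my-resource",
  "my-resources": "my-resource",
};

// Returns the canonical resource name for an alias, or the input unchanged.
function resolveResource(name: string): string {
  // Own-property check so inherited keys such as "constructor" are not
  // mistaken for aliases.
  const isAlias = Object.prototype.hasOwnProperty.call(ALIASES, name);
  return isAlias ? (ALIASES[name] ?? name) : name;
}
```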

4. Write the handler with defineCommand()

Create a handler file in src/commands/:

// src/commands/my-resource.ts
import { defineCommand } from "../framework/index.ts";

export const myverbMyResourceCommand = defineCommand(
  "myverb",
  "my-resource",
  async (ctx, flags, args) => {
    const { client, profile } = ctx;

    // flags.myFlag → string | undefined (inferred from FlagDef)
    // flags.myKey  → MyBrandedKey | undefined (inferred from validator)
    // args.key     → MyBrandedKey (required positional, branded)

    // Dry-run support (required for all commands)
    const dr = ctx.dryRun({
      command: "myverb my-resource",
      method: "POST",
      endpoint: "/my-resources",
      profile,
    });
    if (dr) return dr;

    const result = await client.doSomething({ key: args.key });

    // Return a CommandResult — the framework handles rendering
    return { kind: "get", data: result };
  },
);

CommandResult kinds:

Kind        Use case                                  Data
list        Search/list results with items array      { items, page?, sorting? }
get         Single resource fetch                     { data }
raw         Raw text output (XML, YAML, etc.)         { content }
dry-run     Dry-run preview                           { command, method, url, body? }
info        Informational messages                    { message }
success     Mutation confirmation                     { message }
no-result   Nothing to display                        (none)

The ctx.dryRun() helper checks the --dry-run flag (captured once at the composition root) and returns a DryRunResult if set, or null to continue. The raw boolean is also available as ctx.isDryRun for handlers that emit a custom dry-run payload (deploy, identity).
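
The kinds above behave like a discriminated union: handlers return data, and one renderer decides how each kind is displayed. The sketch below is a trimmed, hypothetical version (`CommandResultSketch`, `renderText`, and `toJson` are invented names, and the real union has more kinds and fields).

```typescript
// Hypothetical, trimmed version of the CommandResult union described above.
type CommandResultSketch =
  | { kind: "list"; items: unknown[] }
  | { kind: "get"; data: unknown }
  | { kind: "raw"; content: string }
  | { kind: "info"; message: string }
  | { kind: "success"; message: string }
  | { kind: "no-result" };

// JSON.stringify returns undefined for some inputs, so fall back to "".
const toJson = (value: unknown, indent?: number): string =>
  JSON.stringify(value, null, indent) ?? "";

// One place decides how each kind is rendered; handlers only return data.
function renderText(result: CommandResultSketch): string {
  switch (result.kind) {
    case "list":
      return result.items.map((item) => toJson(item)).join("\n");
    case "get":
      return toJson(result.data, 2);
    case "raw":
      return result.content;
    case "info":
    case "success":
      return result.message;
    case "no-result":
      return "";
  }
}
```

Because the switch is exhaustive over `kind`, adding a new kind to the union makes the compiler flag every renderer that has not handled it yet.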

5. Register in COMMAND_DISPATCH

Add the handler to the dispatch map in src/command-dispatch.ts:

import { myverbMyResourceCommand } from "./commands/my-resource.ts";

export const COMMAND_DISPATCH: ReadonlyMap<string, AnyCommandHandler> = new Map([
  // ... existing entries
  ["myverb:my-resource", myverbMyResourceCommand],
]);

The key format is "verb:resource". For resourceless verbs (like deploy), use "verb:".
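
The key format can be exercised with a tiny lookup. In this sketch `Handler` is a stand-in for AnyCommandHandler, and `DISPATCH_SKETCH` and `lookup` are hypothetical names; the real routing lives in src/index.ts and src/command-dispatch.ts.

```typescript
type Handler = () => string; // stand-in for AnyCommandHandler

const DISPATCH_SKETCH: ReadonlyMap<string, Handler> = new Map<string, Handler>([
  ["myverb:my-resource", () => "myverb handler"],
  ["deploy:", () => "deploy handler"],
]);

// Resourceless verbs use an empty resource, which yields keys like "deploy:".
function lookup(verb: string, resource = ""): Handler | undefined {
  return DISPATCH_SKETCH.get(`${verb}:${resource}`);
}
```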

6. Add tests
  • Unit tests in tests/unit/ — test the handler via the CLI subprocess helper c8()
  • Behaviour tests: c8('myverb', 'my-resource', '--dry-run') proves end-to-end dispatch
  • Help tests — verify the command appears in help output
  • Completion tests — the new command is automatically included in shell completions (derived from registry)

What you get for free

By adding the registry entry and dispatch wiring, these features are automatically derived:

  • c8ctl help includes the command with description, flags, and examples
  • c8ctl help myverb shows detailed help (if hasDetailedHelp: true)
  • c8ctl completion bash/zsh/fish includes the verb, resource, and all flags
  • parseArgs accepts the declared flags with correct types
  • Flag validation runs at the boundary (branded types enforced)
  • --dry-run support (via the ctx.dryRun() helper)
  • Output rendering (JSON, table, fields filtering) handled by the framework
  • Resource alias resolution (mr → my-resource) works everywhere

Resourceless commands

Some verbs don't take a resource (e.g. deploy, run, watch). Set requiresResource: false and resources: [], then register with an empty resource key:

// Registry
deploy: {
  description: "Deploy resources",
  mutating: true,
  requiresResource: false,
  resources: [],
  flags: { ...DEPLOY_FLAGS },
},

// Dispatch
["deploy:", deployCommand]