Releases: rpamis/comet
0.3.7
What's Changed
- feat: propagate workflow output language by @ddddddddwp in #53
- fix: prevent skip-all from skipping uninstalled components in comet init by @qiansanyu in #73
- fix(skills): enforce executing-plans review gate (#41) by @ddddddddwp in #76
- feat: add auto transition config by @Ninzero in #74
- feat: token optimization, context compression beta, and anti-drift guards by @benym in #78
Added
- Auto-transition config: Added `auto_transition` (`true|false`) to `.comet.yaml` and the `.comet/config.yaml` project default so teams can choose whether Comet automatically advances to the next phase skill or pauses for a manual transition. When `auto_transition: false`, build/design/open/verify skills stop after meeting exit conditions and print the next manual step instead of invoking the next skill. Includes state-machine whitelist, enum validation, and schema (`comet-yaml-validate.sh`) coverage (#74).
- Deterministic next-step resolver: Added `comet-state next <change-name>` to resolve post-guard routing from `.comet.yaml` (`phase`, `workflow`, `auto_transition`) with structured output: `NEXT: auto|manual|done`, `SKILL: <skill-name>`, and `HINT` (manual mode). This centralizes next-skill routing logic in scripts instead of duplicating it across skill prose.
- Workflow output language: Comet workflows now propagate the triggering user request language into OpenSpec and Superpowers steps via an explicit Output Language Rule, keeping generated proposals, designs, plans, verification reports, and archive notes readable in the user's language. Resuming an existing change preserves the dominant artifact language unless the user explicitly asks to switch (#53, #37).
- Execution benchmark (Claude Code): Added `benchmark:execution`, a benchmark harness with three test phases: L1 (design doc generation from handoff context), L2 (build a note-board module from handoff context + run tests), and L3 (full workflow: implement a dictionary module from spec, run 10 vitest tests). Invokes Claude Code (`claude -p`) and measures actual test pass rate, token usage, retry count, duration, and cost. Compares `off` vs `beta` context compression modes across small/medium/large tiers. Supports `--phase l1|l2|l3|both|all` and `--dry-run` for deterministic verification. Extracted shared utilities (`spawnCapture`, `parseClaudeJson`, `buildClaudeArgs`, etc.) to `scripts/benchmark-utils.mjs`.
- Token optimization: TDD skill single load: Build skill now loads the `test-driven-development` skill once before the first task (instead of per-task), saving ~44K tokens per 10-task workflow. Includes compaction recovery guidance to reload once on resume.
- Token optimization: brainstorming checkpoint: Design skill now writes `brainstorm-summary.md` after the user confirms the design approach, providing a compaction recovery point that preserves confirmed decisions across context window compression.
- Token optimization: incremental brainstorming checkpoint: Design skill now incrementally updates `brainstorm-summary.md` during brainstorming, preserving confirmed facts, candidate decisions, risks, testing notes, and pending questions before platform-driven context compaction can occur.
- Token optimization: active compaction gate: Design skill now requires an active context compaction gate after `brainstorm-summary.md` is finalized and before creating the Design Doc, using the host platform's native compaction mechanism when available and falling back to a manual user prompt when it is not.
- Token optimization: plan creation subagent offload: Build skill offloads `writing-plans` execution to a subagent, freeing main session context. The subagent reads the Design Doc and `tasks.md` from files and returns the plan file path. Falls back to inline execution on subagent failure.
- Token optimization: verification skill dedup: Verify skill loads `verification-before-completion` once before the light/full branch point instead of in each branch, eliminating redundant skill content.
- Token optimization: tasks.md incremental scan: Build skill uses `grep` to find unchecked tasks instead of re-reading the entire `tasks.md` file after each task completion.
- Token optimization: hash on-demand read in verify: Verify skill checks `handoff_hash` before re-reading OpenSpec artifacts. When the hash matches, only `tasks.md` is skipped (`proposal.md` and `design.md` are still read for comparison checks). Uses the new `comet-handoff.sh --hash-only` flag.
- `--hash-only` flag for comet-handoff.sh: New backward-compatible flag outputs the context hash without generating handoff files, used by the verify phase for hash comparison. Validates that required files exist before computing the hash.
- CodeGraph integration in comet init: `comet init` now offers an optional step to install and configure CodeGraph (`@colbymchenry/codegraph`) for semantic code intelligence. It auto-detects supported platforms (Claude Code, Cursor, Codex, OpenCode, Gemini, Kiro, Antigravity), installs the CLI if missing, runs `codegraph install` for agent wiring, and initializes the project index. Skips gracefully under `--json` mode.
- Stale PR automation: Added a scheduled and manually runnable GitHub Actions workflow that marks inactive pull requests stale after 90 days and closes them after another 30 days, helping keep long-idle review queues manageable.
- TDD mode field: Added `tdd_mode` (`tdd|direct`) to the `.comet.yaml` state machine so users choose whether to enforce TDD during build. When `tdd_mode: tdd`, subagent dispatches inject an explicit TDD hard constraint, bypassing implementer-prompt.md's conditional trigger. Addresses #67.
- subagent_dispatch field: Added `subagent_dispatch` (`null|confirmed`) to the `.comet.yaml` state machine, ensuring `build_mode: subagent-driven-development` can only leave the build phase after the platform's real background dispatch capability is confirmed.
- Verify retry limit: Verify skill now enforces a mandatory user decision after 3 consecutive verify-fail cycles, preventing indefinite automated retry loops.
- Manual verify_mode override: Users can override automatic verification scale assessment via `comet-state set <name> verify_mode <light|full>` when the auto-detected mode doesn't fit.
- Local context compression benchmark: Added `benchmark:context`, a local Codex benchmark harness that creates matched `context_compression: off` and `beta` Comet fixtures, runs `codex exec` against each mode, and reports token savings, spec drift rate, task completion rate, parse success, and timing. Use `--dry-run` for deterministic non-Codex verification.
- Beta-gated context compression switch: Project installs now create `.comet/config.yaml` with `context_compression: off`, allowing teams to opt new changes into beta spec projection by setting `context_compression: beta`. This switch controls only the OpenSpec handoff projection path (`spec-context.*`); the workflow token optimizations above are default-on and do not require beta mode.
- Beta spec projection handoff: `/comet-design` can now use beta context compression to generate `spec-context.json` and `spec-context.md`, preserving OpenSpec requirement and scenario headings with source hashes so compact design handoffs reduce token load without weakening acceptance coverage.
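
The structured output of `comet-state next` lends itself to simple line-based routing. The following is a minimal sketch only: it assumes the resolver prints one `KEY: value` pair per line (the notes above list the keys but not the exact layout), and `route_next_step` is a hypothetical helper, not part of Comet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide what to do from the text printed by
# `comet-state next`. Assumes one "KEY: value" pair per line.
route_next_step() {
  local output="$1" line next="" skill="" hint=""

  # Read from a here-string so the loop runs in the current shell
  # and the variables set inside it survive.
  while IFS= read -r line; do
    case "$line" in
      "NEXT: "*)  next="${line#NEXT: }" ;;
      "SKILL: "*) skill="${line#SKILL: }" ;;
      "HINT: "*)  hint="${line#HINT: }" ;;
    esac
  done <<< "$output"

  case "$next" in
    "auto")   echo "invoke $skill" ;;
    "manual") echo "pause: ${hint:-run $skill manually}" ;;
    "done")   echo "nothing to do" ;;
    *)        echo "unrecognized NEXT value: '$next'" >&2; return 1 ;;
  esac
}
```

A caller might use it as `route_next_step "$(comet-state next my-change)"`, where `my-change` is a placeholder change name.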
Changed
- executing-plans review gate: When `build_mode` is `executing-plans`, the build phase now requires loading the Superpowers `requesting-code-review` skill and requesting code review at least once before the build→verify phase guard. CRITICAL findings must be fixed before verify; accepted non-CRITICAL findings must record acceptance rationale in a durable artifact. The build-exit checklist enforces this gate (#76, #41).
- Phase advancement vs handoff wording: Chinese and English Comet skills now consistently distinguish guard-driven phase advancement (`--apply`, always updates `phase`) from next-skill invocation control (`auto_transition`). Open/design/build/verify/hotfix/tweak guidance now routes through `comet-state next` for auto/manual handoff.
- Preset continuity wording: Hotfix and tweak guidance now explicitly documents the `auto_transition: false` exception in continuous execution mode, removing contradictory wording around "always continue" behavior.
- Verify hash-skip scoped to tasks.md only: Full verification always reads `proposal.md` and `design.md` even when the hash matches, ensuring goal-satisfaction and design-consistency checks have complete context.
- Design Doc creation stays in main session: Design Doc is created inline (not offloaded to subagent) to preserve full brainstorming conversation context and prevent information loss for complex requirements.
- Subagent failure fallback: Plan creation subagent offload includes an explicit degraded fallback: if the subagent fails, the main session loads `writing-plans` inline.
- Beta spec verbatim projection: Beta context compression now projects entire spec files verbatim (`cat`) instead of filtering by English keywords (`GIVEN`/`WHEN`/`THEN`/`AND`/`BUT`). This eliminates language-dependent matching, ensures zero acceptance-criteria drift for Chinese or non-English specs, and removes the fragile AWK filter entirely.
- JSON structural validation: `comet-guard.sh` now validates `spec-context.json` structure (required fields: `change`, `phase`, `mode`, `files`, `context_hash`) and source file reference coverage, rep...
0.3.6
What's Changed [0.3.6] - 2026-06-02
Added
- Plan-ready build pause state: Added `build_pause` as a dedicated build-phase pause marker so Comet can stop after plan generation without confusing the pause with the actual execution method.
- Plan-ready pause design: Added a design record for the model-switching pause workflow, covering recovery behavior, stale pause handling, and plan-missing remediation.
Changed
- Build recovery routing: `/comet` and `/comet-build` now recognize `build_pause: plan-ready`, reuse the existing plan, and resume at workspace isolation and execution-method selection instead of regenerating the plan.
- Bilingual workflow documentation: Chinese and English Comet skills now describe the plan-ready pause point, clarify that `build_pause` is not `build_mode`, and document the same state field in both README files.
Fixed
- GitHub Copilot Superpowers skill names: Comet skills now invoke the bare Superpowers skill names installed by the GitHub Copilot skills path, avoiding blocked workflows caused by unresolved `superpowers:*` aliases.
- Windows bash resolution: Comet now resolves a usable bash executable through `COMET_BASH`, rejects the Windows WSL launcher path, and uses the resolved executable for nested script calls so guard, handoff, and archive flows do not fall back to a broken PATH `bash`.
- Shell test runner bash resolution: `run-bats.js` now resolves a usable bash through `COMET_TEST_BASH`, `COMET_BASH`, PATH, or Git Bash defaults, avoiding the broken Windows WSL launcher when running shell tests from Node.
- Schema validation fatal output: Guard validation now preserves the final fatal schema-validation message after printing validator diagnostics, making invalid `.comet.yaml` failures easier to recognize.
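
The bash-resolution fixes above follow a "prefer an explicit override, reject the WSL launcher, then fall back" pattern. The sketch below illustrates that pattern in shell under stated assumptions: `is_wsl_launcher` and `resolve_bash` are hypothetical names, only `COMET_BASH` and PATH are consulted, and the launcher is recognized purely by its `System32\bash.exe` location. Comet's actual resolver may differ.

```shell
#!/usr/bin/env bash
# True when the path looks like the Windows WSL launcher
# (…\Windows\System32\bash.exe), compared case-insensitively.
is_wsl_launcher() {
  local normalized
  # Lower-case the path and turn backslashes into forward slashes.
  normalized="$(printf '%s' "$1" | tr 'A-Z\\' 'a-z/')"
  case "$normalized" in
    */windows/system32/bash.exe) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical resolver: try COMET_BASH first, then bash found on PATH.
resolve_bash() {
  local candidate
  for candidate in "${COMET_BASH:-}" "$(command -v bash || true)"; do
    [ -n "$candidate" ] || continue              # skip unset candidates
    is_wsl_launcher "$candidate" && continue     # never use the WSL launcher
    [ -x "$candidate" ] || continue              # must be executable
    printf '%s\n' "$candidate"
    return 0
  done
  echo "no usable bash found" >&2
  return 1
}
```

Nested script calls would then run through `"$(resolve_bash)" script.sh` instead of a bare `bash script.sh`.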
Tests
- Superpowers skill invocation regression: Added coverage that shipped Comet skill prose does not reference plugin-prefixed Superpowers aliases.
- Comet bash execution regression: Added coverage for nested script calls, shipped command examples, and the shell test runner so Comet uses resolved bash paths instead of raw PATH `bash`.
- Plan-ready pause regression: Added shell-script coverage for `build_pause` initialization, schema validation, state updates, and build recovery output.
- README state-field regression: Added README coverage to ensure `build_pause` appears in examples and field descriptions for both English and Chinese documentation.
0.3.5
What's Changed [0.3.5] - 2026-05-30
Added
- Context compaction recovery (`--recover`): `comet-state check <name> <phase> --recover` outputs a structured recovery context (phase status, field progress, task count, and recovery actions) so an agent can quickly locate its breakpoint and resume after context compaction.
- Red Flags Anti-Rationalization List: Added 5 red flag warnings to the main scheduling skill (making decisions for the user, skipping confirmation, replacing historical preferences, agreeing without objection, and passing without verification), helping the agent identify its own overreach tendencies.
- Uncertainty Degradation Principles: Added SUGGESTION > WARNING > CRITICAL degradation rules to the verify skill. Only build failures, test failures, and security issues are marked CRITICAL; ambiguous issues must be downgraded.
- Anti-Automatic Selection Guardian: Added naming and scope anti-automatic selection rules to the open skill. Change names must be specified by the user or confirmed through AskUserQuestion, and the scope cannot be expanded or narrowed arbitrarily.
- File Existence Verification: Before entering user confirmation, the open skill verifies that the proposal/design/tasks files are not empty, preventing empty files from passing the check.
- Idempotency Description: Idempotency descriptions have been added to all skill stages (open/design/build/verify), clarifying which operations can be safely retried and which fields require confirmation before skipping.
Changed
- AskUserQuestion Tool Clarification: All 7 decision blocking points (open confirmation, brainstorming confirmation, build workflow, verify failure decision, spec drift handling, branch handling, upgrade conditions) are uniformly required to use the AskUserQuestion tool; plain text prompts are prohibited.
- Decision Points Expanded from 6 to 7: The open stage proposal/design/tasks review confirmation is now the first decision point.
- Spec Drift Single-Choice Question Format: Spec drift handling in the verify stage is now an AskUserQuestion single-choice question (A/B/C, choose one of three) with no implicit default option.
- Completely synchronized Chinese and English skills: The content, structure, and option format of the 7 Chinese skills and 7 English skills are completely aligned.
Fixed
- Crash due to an unbound variable under `set -u`: When `tasks.md` was missing during the build phase, `comet-state check --recover` read the undeclared `pending` variable and the script exited immediately; fixed by moving the `local` declaration forward and adding an explicit `tasks.md MISSING` branch to the recovery action chain.
- Path truncation risk: `field_status` used `${var%% *}` on `design_doc`, which could truncate paths containing spaces; changed to `${var% }` to only remove trailing spaces.
- Inconsistent reading style for optional fields: `direct_override` used `|| echo ""` while other optional fields used `|| true`; unified to `|| true` to be consistent with `cmd_scale`.
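
The first two fixes above come down to standard bash behavior. The snippet below is an illustrative sketch, not Comet's code: `report_pending` is a hypothetical helper, the `- [ ]` checkbox pattern is an assumed `tasks.md` format, and the `design_doc` value is made up to show the difference between the two expansions.

```shell
#!/usr/bin/env bash
set -u  # reading an unset variable is now a fatal error

# Hypothetical helper: count unchecked tasks, tolerating a missing file.
report_pending() {
  local tasks_file="$1"
  local pending=0   # declared before any branch can read it
  if [ -f "$tasks_file" ]; then
    # grep -c prints 0 but exits 1 when nothing matches; || true keeps going.
    pending="$(grep -c '^- \[ \]' "$tasks_file" || true)"
    echo "pending=$pending"
  else
    echo "tasks.md MISSING"
  fi
}

# Made-up value: a path containing a space, plus one trailing space.
design_doc="docs/design notes/plan.md "

# %% removes the longest suffix matching " *", so it cuts at the FIRST space.
echo "[${design_doc%% *}]"   # prints [docs/design]

# % with a single space removes just one trailing space, if present.
echo "[${design_doc% }]"     # prints [docs/design notes/plan.md]
```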
Tests
- Added 8 `check --recover` and boundary test cases, covering five phases (open/build/verify/design/archive) as well as boundary scenarios such as missing `tasks.md` and all tasks completed.
- Total number of tests increased from 34 to 42, all passing.
0.3.4
What's Changed [0.3.4] - 2026-05-29
Changed
- Command execution security: Refactored all command execution in OpenSpec and Superpowers install paths from `spawn` with shell interpretation to `execFileSync`, eliminating shell injection surface and improving cross-platform reliability
Fixed
- OpenSpec global install path for OpenCode: `comet init --scope global` now migrates OpenSpec skills from the hardcoded `~/.opencode/` directory to `~/.config/opencode/` where OpenCode actually reads them, with a self-deletion guard when source and destination paths coincide (#46, @gleami)
- Windows command execution: Added `shell` option to `execFileSync` calls on Windows so command shims (`.cmd`) resolve correctly
- Doctor `.comet.yaml` validation: `comet doctor` now validates top-level keys instead of silently accepting unknown keys, and `readDir` errors other than ENOENT are no longer swallowed (@felanny)
- CI JSON parsing: CI workflow parses command output by finding the first `{` character, preventing non-JSON prefix lines from breaking JSON extraction (@kathy32)
- CI warning output: CI now only counts and prints warnings when a step actually fails, reducing noise in successful runs (@kathy32)
- Spawn stdio noise: Changed `inherit` to `ignore` for non-interactive spawn stdio so OpenSpec/Superpowers installers don't print unrelated progress to the console (@kathy32)
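
The "find the first `{`" approach from the CI JSON parsing fix can be sketched in shell with plain parameter expansion. This is an assumption-laden illustration: `extract_json` is a hypothetical helper, and the actual CI workflow may implement the same idea in another language.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print everything from the first "{" onward,
# dropping any non-JSON lines that precede it.
extract_json() {
  local output="$1"
  case "$output" in
    *"{"*)
      # ${output#*\{} strips the shortest prefix ending in "{",
      # so the brace is re-added in front of the remainder.
      printf '{%s\n' "${output#*\{}"
      ;;
    *)
      echo "no JSON object found" >&2
      return 1
      ;;
  esac
}
```

For example, `extract_json "$(comet init --json)"` would hand only the JSON part to a parser even if a warning line were printed first.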
Tests
- Added coverage for OpenCode global OpenSpec path migration, self-deletion guard, and homedir mocking
- Added doctor tests for `.comet.yaml` top-level key validation and non-ENOENT `readDir` error propagation
- Fixed timeout for git-based test "uses plan base-ref to scale verification"
Docs
- Improved README setup guidance with clearer installation instructions and collapsible reference panels (both English and Chinese) (@bevishe)
- Added contributors wall to both README and README-zh (@Joechan11)
0.3.3
What's Changed [0.3.3] - 2026-05-27
Fixed
- OpenSpec all-workflows installation: `comet init` now writes the all-workflows config directly to the platform-specific default config path (`%APPDATA%\openspec\` on Windows, `$XDG_CONFIG_HOME/openspec/` on macOS/Linux when set, otherwise `~/.config/openspec/`) in addition to the isolated `XDG_CONFIG_HOME` env override, ensuring all 11 OpenSpec workflows are always installed regardless of the user's previous OpenSpec config state.
0.3.2
What's Changed [0.3.2] - 2026-05-27
Added
- Script discovery helper: New `comet-env.sh` centralizes script path resolution by sourcing sibling scripts from its own directory, replacing the scattered `COMET_SEARCH_ROOTS` find logic across all English and Chinese skills.
- OpenCode global config directory: OpenCode platform now supports a separate `globalSkillsDir` (`.config/opencode`) for global installs, keeping project and user-level skills distinct.
- Command error diagnostics: New `command-error.ts` module extracts and cleans stderr/stdout from failed shell commands, used by both OpenSpec and Superpowers install paths to surface actionable failure details.
Changed
- Build decision-point wording: Strengthened the build skill's workspace-isolation and execution-method selection wording so agents cannot choose on behalf of the user based on recommendation rules.
- Hotfix/Tweak upgrade wording: Reworded upgrade-condition and verification-failure pause requirements in hotfix and tweak skills for clearer blocking semantics.
- Comet user decision numbering: Fixed out-of-sequence numbering in the Chinese comet skill's user decision point list.
Fixed
- OpenSpec workflow installation: `comet init` now runs OpenSpec with `--profile custom` and a temporary config that enables all workflows (`propose`, `explore`, `new`, `continue`, `apply`, `ff`, `sync`, `archive`, `bulk-archive`, `verify`, `onboard`), ensuring Comet installs more than the default core workflow set.
- OpenCode slash commands: `comet init` now generates OpenCode command files (`commands/*.md`) that keep the `/comet*` command names while embedding the corresponding Comet workflow content, so OpenCode users can invoke `/comet`, `/comet-open`, etc. directly.
- Lingma Superpowers path: `comet init` now keeps Lingma out of the unsupported `skills --agent lingma` path and copies staged Superpowers skills into `.lingma/skills`, preventing the whole external installer batch from failing while preserving Lingma's expected directory layout.
- Lingma global directory: Lingma's global skills directory is explicitly `.lingma`, matching `~/.lingma/skills/{skill-name}/SKILL.md` for user-level installs and `.lingma/skills/{skill-name}/SKILL.md` for project installs.
- Script discovery safety: `comet-env.sh` no longer changes caller shell options when sourced, returns failure when bundled scripts are missing, and avoids ShellCheck unreachable-command diagnostics.
- comet-state.sh field whitelist: Added `created_at` and `base_ref` to the `cmd_set` allowed fields list, aligning validation with fields already written during `.comet.yaml` initialization.
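
A sourceable discovery helper with the properties described above (resolve siblings from its own directory, leave the caller's shell options alone, fail when a bundled script is missing) might look roughly like this. It is a sketch under assumptions: `comet_env_load` and the exported `COMET_*_SH` variable names are hypothetical, and only three of the scripts named in these notes are checked.

```shell
#!/usr/bin/env bash
# Sketch of a sourceable helper. It deliberately sets no shell options
# (no `set -e` / `set -u`), so a caller that sources it keeps its own.

comet_env_load() {
  # Default to the directory containing this file, independent of the
  # caller's working directory; an explicit directory may be passed instead.
  local dir="${1:-$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)}"
  local name

  # Script names come from the release notes; the list is not exhaustive.
  for name in comet-state.sh comet-guard.sh comet-handoff.sh; do
    if [ ! -f "$dir/$name" ]; then
      echo "comet-env: missing bundled script: $dir/$name" >&2
      return 1   # report failure instead of exiting the caller's shell
    fi
  done

  # Hypothetical variable names for the resolved paths.
  export COMET_STATE_SH="$dir/comet-state.sh"
  export COMET_GUARD_SH="$dir/comet-guard.sh"
  export COMET_HANDOFF_SH="$dir/comet-handoff.sh"
}
```

Using `return` rather than `exit` matters here: an `exit` inside a sourced file would terminate the caller's shell.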
Tests
- Script discovery coverage: Added tests verifying `comet-env.sh` exports all bundled script paths and that no skill file inlines `COMET_SEARCH_ROOTS`.
- Script discovery safety: Added regression coverage for sourced shell option preservation and expandable `$HOME` skill-directory globs.
- OpenCode Comet detection: Added tests for OpenCode requiring both skill directories and matching command files before reporting Comet as installed.
- OpenCode E2E init: Added end-to-end tests for OpenCode project and global scope installs, including command file generation.
- OpenCode command content: Added tests that OpenCode command files preserve Comet command names and include full selected-language workflow content instead of a thin skill-delegation stub.
- English workflow safeguards: Added parity tests matching the existing Chinese workflow decision-point requirements.
- OpenSpec profile and diagnostics: Added tests for custom profile creation, `--profile custom` flag, and stderr/stdout detail printing on install failures.
- Lingma Superpowers fallback: Added regression coverage that Lingma is excluded from the unsupported skills CLI agent list and uses a staging install before copying skills to `.lingma`.
- Lingma global install path: Added regression coverage for `comet init --scope global` installing Lingma Comet skills under the user `.lingma/skills` directory.
0.3.1
What's Changed [0.3.1] - 2026-05-26
Added
- Workflow state metadata: `.comet.yaml` initialization now records `base_ref` and `created_at` so scale assessment and validation can reason from a stable change baseline.
Changed
- Comet decision points: Clarified Chinese and English workflow skills so design confirmation, build configuration, verification failures, spec drift, branch handling, and preset upgrades pause for explicit user choice instead of relying on defaults or recommendations.
- Build workflow selection: Combined workspace isolation and execution-method selection into one build configuration step, reducing repeated pauses while still requiring `isolation` and `build_mode` before implementation can continue.
- Hotfix verification flow: Moved root-cause elimination before the build guard and requires preset upgrades to switch `workflow` to `full`, keeping failed hotfix checks in the build phase and full-flow upgrades in a consistent state.
- Verification scale assessment: Scale checks now fall back to `.comet.yaml` `base_ref` and use a four-file threshold for full verification, making committed build changes less likely to be undercounted.
- English skill parity: Synced English Comet skills with the Chinese workflow rules, including handoff generation, dirty-worktree handling, spec drift decisions, and verification failure blocking.
Fixed
- Windows npm update: `comet update` now spawns npm through the shell so the package update path works reliably with Windows command shims.
- Superpowers install diagnostics: Failed Superpowers installs now print cleaned stderr details, making network or GitHub access failures visible instead of hiding the actionable cause.
Tests
- Workflow safeguard coverage: Added regression coverage for Chinese Comet decision-point requirements and Superpowers install failure diagnostics.
0.3.0
What's Changed [0.3.0] - 2026-05-25
Added
- Dirty worktree recovery protocol: Added shared English and Chinese `comet/reference/dirty-worktree.md` references so agents consistently protect, inspect, and attribute user or mixed-source working tree changes during resume
Changed
- Comet resume behavior: Updated `/comet`, build, verify, hotfix, and tweak skills so manual code edits made during interruptions are treated as code evidence, not automatic state transitions; agents must attribute dirty worktree changes before continuing or advancing guards
- Reference skill installation: Added the dirty worktree reference file to the Comet manifest so installed English and Chinese skill sets can resolve `comet/reference/dirty-worktree.md`
0.2.9
What's Changed [0.2.9] - 2026-05-24
Changed
- Antigravity skill paths: Updated platform handling so project-scope installs use `.agents/skills` while global installs use Antigravity's `.gemini/antigravity/skills` location, keeping `init`, `doctor`, and `update` aligned with Antigravity's directory model
- README information architecture: Reworked English and Chinese README sections so command details, platform lists, skill tables, script tables, `.comet.yaml` fields, and reliability notes are available in collapsible reference panels
- Spec lifecycle documentation: Expanded the README explanation of Comet's Spec lifecycle management, including OpenSpec/Superpowers artifact linking, automated handoff, state updates, validation, and archive sync
- Security guidance location: Moved repository maintenance security notes from README into `CONTRIBUTING.md`, keeping the README focused on user-facing Comet concepts and setup
Fixed
- Antigravity global installs: Fixed `comet init --scope global` and related health checks so Antigravity no longer installs or searches global skills under the project-style `.agents` directory
- Missing skills directories: Added explicit existence checks before scanning project and global skills directories, keeping detection and update logic robust when platform directories exist without `skills/`
Tests
- Antigravity path coverage: Added regression coverage for Antigravity project/global skill directories across detection and init E2E behavior
- README structure coverage: Verified the updated README command and reference structure with the existing README test suite
0.2.8
What's Changed [0.2.8] - 2026-05-24
Added
- Design handoff script: New `comet-handoff.sh` generates deterministic, source-traceable context packages (compact or full mode) from OpenSpec artifacts into `.comet/handoff/`, recording `handoff_context` and `handoff_hash` in `.comet.yaml`
- Handoff guard checks: Design phase guard now validates handoff context existence, hash freshness (detects post-handoff OpenSpec mutations), markdown traceability markers, and design doc frontmatter fields (`comet_change`, `role: technical-design`, `canonical_spec: openspec`)
- `handoff_context` and `handoff_hash` fields: New `.comet.yaml` fields for tracking script-generated handoff packages, with schema validation (path existence, sha256 hex digest format)
- `comet init --scope`: New `--scope <global|project>` CLI flag for non-interactive scope selection
- CI init E2E job: GitHub Actions now runs real `comet init` on Ubuntu, macOS, and Windows, verifying Comet skills, Superpowers, OpenSpec, and working directories land in correct filesystem locations for both project and global scope
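
The "sha256 hex digest format" and "hash freshness" checks mentioned above can be approximated with two small predicates. This is a sketch only: `is_sha256_hex` and `handoff_is_fresh` are hypothetical names, and accepting only lowercase hex is an assumption about the expected format.

```shell
#!/usr/bin/env bash
# Hypothetical format check: exactly 64 lowercase hexadecimal characters.
is_sha256_hex() {
  [[ "$1" =~ ^[0-9a-f]{64}$ ]]
}

# Hypothetical freshness check: the hash recorded at handoff time must be
# well-formed and equal to the hash recomputed from the current artifacts.
handoff_is_fresh() {
  local recorded="$1" current="$2"
  is_sha256_hex "$recorded" && [ "$recorded" = "$current" ]
}
```

A guard could call `handoff_is_fresh "$recorded_hash" "$recomputed_hash"` and refuse to advance when it returns non-zero, which is how a post-handoff edit to an OpenSpec artifact would be detected.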
Changed
- Chinese skill docs updated: `comet-design/SKILL.md` and `comet/SKILL.md` now document the handoff flow, replacing agent-authored summaries with script-generated context packs
- JSON generation uses process substitution: `write_json_context` in `comet-handoff.sh` uses `< <(source_files)` instead of pipe subshell, fixing variable scoping
- Error message formatting: `comet-state.sh` unknown-field error message split from a single 270+ character line into multiple lines for readability
- CLAUDE.md and AGENTS.md: Added project-level instructions covering test commands, shell script conventions, script dependency graph, `.comet.yaml` state machine sync rules, and changelog format
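
The variable-scoping issue behind the process-substitution change is general bash behavior: each side of a pipe runs in a subshell, so variables set inside a piped `while` loop are lost. The sketch below demonstrates it; the `source_files` body and the counter functions are illustrative stand-ins, not the code in `comet-handoff.sh`.

```shell
#!/usr/bin/env bash
# Stand-in for a function that lists input files, one per line.
source_files() { printf '%s\n' proposal.md design.md tasks.md; }

# Pipe: the while loop runs in a subshell, so its updates to `count`
# never reach the enclosing function (bash default, lastpipe off).
count_with_pipe() {
  local count=0 file
  source_files | while IFS= read -r file; do
    count=$((count + 1))
  done
  echo "$count"
}

# Process substitution: the loop runs in the current shell and only the
# producer is a separate process, so `count` keeps its final value.
count_with_process_substitution() {
  local count=0 file
  while IFS= read -r file; do
    count=$((count + 1))
  done < <(source_files)
  echo "$count"
}

count_with_pipe                   # prints 0
count_with_process_substitution   # prints 3
```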
Fixed
- YAML and frontmatter parsing: Comet scripts now ignore unquoted trailing comments in `.comet.yaml` field values and accept Design Doc frontmatter after a UTF-8 BOM or leading blank lines, preventing false guard and handoff failures
- Init E2E install checks: CI now verifies Comet-owned skill artifacts in every supported platform directory and checks OpenSpec/Superpowers installer status from `comet init --json` for both project and global installs, avoiding false failures from external CLI-specific directory layouts
- Windows global init E2E home directory: CI now sets `USERPROFILE` alongside `HOME` for global-scope init checks on Windows, matching Node's `os.homedir()` resolution and preventing false missing-skill failures
- README state documentation: README examples now show accurate `.comet.yaml` build-state defaults, verification evidence timing, handoff fields, and project-only working directory creation
Tests
- Added coverage for `--full` handoff mode, missing OpenSpec artifacts rejection, post-handoff hash mismatch detection, and design doc frontmatter validation
- Added `comet init` E2E tests covering project scope install, global scope install, skip-existing with `--yes`, overwrite with `--overwrite`, and multi-platform detection
- Added regression coverage for `.comet.yaml` trailing comments and Design Doc frontmatter with a UTF-8 BOM or leading blank lines
- Added CI workflow regression coverage for project and global installation checks across Comet-owned files and external OpenSpec/Superpowers installer statuses
- Added CI workflow regression coverage for Windows global init using the temporary `USERPROFILE` home directory