Forward-looking sprint plan and active queue.
Current state: v0.50.281 | 3995 tests | port 8787 Target A (CLI parity): ✅ Complete Target B (Claude parity): ~95% — full subagent transparency UI and code-execution cells remain
Per-version detail: CHANGELOG.md Sprint history (chronology): ROADMAP.md
A sprint is a thematic batch — usually 3–8 PRs landed together as a release. Each sprint has:
- Theme — single-sentence framing of what changes
- Items — selected work, with tracking issue / PR
- Out of scope — what is explicitly deferred and why
- Risks — known sharp edges
- Retro — once shipped, what we learned
External contributor PRs that don't fit a planned sprint get released individually as patch versions (v0.50.X). Sprints are reserved for coherent batches where the items reinforce each other.
These are the issues currently labeled sprint-candidate — the inbox the next sprint plan draws from. Each item already has a confirmed root cause or design direction.
| # | Title | Type | Sprint fit |
|---|---|---|---|
| #1458 | Persistent-host crashes — bootstrap fork pattern, state.db FD leak, HTTP-unhealthy wedge | bug, stability | Stability sprint candidate |
| #1426 | OpenRouter free-tier :free variants invisible (tool-support filter) |
bug + feat | Model picker sprint candidate |
| #1362 | In-app OAuth login for Codex and Claude (currently terminal-only) | feat, ux | Onboarding sprint candidate |
| #1360 | macOS desktop app — auto-scroll overrides user scroll (#677 regression) | bug, ux | Desktop app polish |
| #1291 | GLM mutually exclusive between main agent and auxiliary title generation | bug | Auxiliary-route sprint |
The active queue stays small — it's the next 1-2 sprints' worth of work. The broader backlog of feature requests lives in ROADMAP.md → Forward Work and on the GitHub enhancement label.
Phase-0 fit assessment. Every sprint candidate gets a marginal-benefit screen first: does this make the product noticeably better for real users, or does it add surface area that costs more to maintain than it earns? Five-question fit screen — need / shape / bloat / clutter / scope.
Salvage over absorb. When a contributor PR is partial or scoped wrong, prefer splicing the good parts into a maintainer-side PR with Co-authored-by attribution rather than asking for multiple rebase rounds. Only absorb whole when the PR is genuinely shippable as-is.
Independent-review gate. Self-built PRs need either (a) Opus advisor pass on the merged stage diff, or (b) independent review from a separate reviewer. High-risk batches (large LOC, security, locks, durability) get both.
Per-PR release velocity. When a single PR fixes a real bug and has a clean review, ship it as its own patch release the same day rather than waiting for a sprint batch. Friction-free is the goal — sprint batches exist for coherence, not for arbitrary grouping.
No feature creep mid-PR. If a contributor proposes a scope addition during review, file it as a separate PR and link from the original. The current PR keeps its original boundaries.
Pre-release gate (mandatory). Every release runs:
pytest tests/ -q --timeout=120clean- Browser sanity check (HTTP-level API tests against a test server)
- Opus advisor pass on the merged stage diff with a written brief
- CHANGELOG.md + ROADMAP.md + TESTING.md version stamp
- CI green on Python 3.11 / 3.12 / 3.13
Skipping any of these requires a documented "I'm doing an override" from the maintainer.
A typical sprint runs 3-7 days end to end:
| Phase | Duration | Output |
|---|---|---|
| Triage | 0.5 day | Active queue → selected items + scope notes |
| Design / spike | 0.5–1 day | Design notes for items needing them; deferrals documented |
| Build | 2–4 days | Each item on its own branch + PR for independent review |
| Review | 0.5–1 day | Maintainer + Opus advisor passes; SHOULD-FIX absorbed in-release |
| Stage + ship | 0.5 day | Stage branch, full test suite, release PR, tag, deploy, verify live |
| Hygiene | 0.5 day | Close PRs / issues, GitHub release notes, docs sync, retro |
Smaller sprints (1–2 PRs) compress to 1–2 days end to end. Single-PR releases skip the stage branch and ship via a release PR directly off the contributor's branch (or a maintainer-rebased copy if the fork doesn't grant write access).
Per-version detail is in CHANGELOG.md. High-level theme chronology is in ROADMAP.md → Sprint History. A few notable sprints worth highlighting:
- Sprint 19 — auth + security hardening. First sprint that made the app safe to leave running beyond localhost.
- Sprint 21 — mobile responsive + Docker. Two-container compose enabled the first wave of self-host deployments.
- Sprint 22 — multi-profile support. Major CLI-parity unlock; profile switching is now seamless without server restart.
- Sprint 25 — macOS desktop application. Native Swift + WKWebView shell, universal Intel + Apple Silicon DMG, Sparkle 2 auto-update. Lives in the separate
hermes-webui/hermes-swift-macrepo. - Sprint 26 — pluggable themes. CSS-variable-driven 8-theme system that lets community contributors add themes as pure CSS.
- Sprint 34 — v0.50.0 UI overhaul. Composer-centric controls, Control Center modal, workspace state machine, rAF streaming throttle.
These are intentionally not on the roadmap. Listing them here to save planning cycles.
- Multi-user collaboration — single-user assumption throughout the codebase. Refactoring would be a from-scratch architecture change.
- Sharing / public conversation URLs — requires hosted backend with access control + CDN. Out of scope for self-hosted.
- Plugin marketplace — Hermes skills already cover this surface.
- Anthropic / Claude proprietary features — Projects AI memory, Claude artifacts sync. Not reproducible.
- Linux / Windows native app wrappers — macOS done; demand on other platforms not yet established. Web UI works in any browser.
- App Store distribution — sandboxing breaks the local-server model.
- Auto-update mechanism for the Python webapp — Sparkle 2 covers the Mac app; the webapp updates via
git pull+ restart, which is the same as every other Python service.
When opening a new sprint plan, copy this structure:
# Sprint NN — <theme>
**Version target:** vX.Y.Z
**Theme:** <single sentence>
**Date started:** YYYY-MM-DD
**Status:** PLANNED | IN PROGRESS | COMPLETED — vX.Y.Z
## Items
| # | Issue | Title | Complexity | Files | PR |
|---|-------|-------|------------|-------|-----|
## Rationale
Why these items, why now.
## Build approach
Per-item branch + PR; or single combined branch if items are tightly coupled.
## Out of scope
What did NOT make this sprint and why.
## Known risks
Sharp edges to watch during review.
## PR Status
| Issue | PR | Status |
|-------|-----|--------|
## Retro (post-ship)
What worked, what we'd do differently.The maintainer's planning notes for each sprint live in the workspace repo (private), not in this file. This file is the public-facing planning shape.