Skip to content

feat: LLM-summarized wake-up briefing (auto-detect invoker, cheap defaults, configurable) #165

@pszymkowiak

Description

@pszymkowiak

Problem

Current wake-up = static top-N memory bullets (~440 tokens). For projects with rich memory history (decisions, resolved errors, blockers, licensing stages, smoke test states), this is too shallow — agents have to fire icm_memory_recall to get useful context, adding latency and turn cost.

When the user asks "where are we?" / "what's missing?", the static bullets give a surface view; the real signal lives in the broader memory store.

Proposal

Replace (or augment) the static wake-up with an LLM-generated narrative briefing, compiled out-of-band whenever project memories change.

Flow

  1. Memory write → enqueue summarization job for the affected project
  2. Job reads all memories (project-scoped + relevant globals)
  3. LLM produces a structured briefing:
    • State of work (in flight, blocked)
    • Recent decisions and their rationale
    • Known errors / resolutions to remember
    • Project-specific user preferences
  4. Briefing cached at ~/.icm/cache/wake-up-{project}.md
  5. Session start loads cached briefing — zero added latency

Auto-detect the invoker AI tool

ICM should reuse the CLI already authenticated on the user's machine. No extra API key to manage.

Detection priority: | Prio | Mechanism | Notes |
|---|---|---|
| 1 | ICM_INVOKER env var | Explicit override (settable by hooks) |
| 2 | Tool-specific env vars | CLAUDECODE, CURSOR_TRACE_ID, GEMINI_CLI, CODEX_HOME, etc. |
| 3 | MCP clientInfo.name | When ICM is invoked via MCP |
| 4 | Parent process walk | Fallback (/proc/$PPID/comm on Linux) |
| 5 | Config TOML default | fallback_provider = "claude" |

Provider mapping (cheap defaults):

Invoker Command Default model
Claude Code claude -p --model <m> claude-haiku-4-5
Codex CLI codex exec --model <m> cheap codex equivalent (TBD)
Gemini CLI gemini -p --model <m> gemini-2.5-flash
Ollama HTTP API qwen2.5:7b
API direct Anthropic/Google/OpenAI SDK provider-specific cheap default

Why cheap defaults: summarization is a simple task. No need for Opus/Sonnet tier. User pays nothing extra in most cases (their own CLI quota).

Configuration (3 levels)

~/.config/icm/config.toml:

[wakeup.summarizer]
provider = "auto"              # auto | claude | codex | gemini | ollama | api
model = ""                     # empty = provider default
fallback_provider = "claude"
max_tokens = 1500
cache_dir = "~/.icm/cache"
regen_on_memory_write = true

CLI override: icm wakeup --summarizer-provider ollama --summarizer-model qwen2.5:7b
TUI: provider/model picker in icm interactive menu.
Per-project: .icm/config.toml in repo root.

Suggested PR breakdown (3 sub-PRs targeting v0.11)

PR 1 — Foundation (~500 LOC)

  • Provider trait
  • Claude CLI provider (only)
  • Auto-detect invoker (env vars + fallback, no MCP/pproc yet)
  • Config schema extension
  • Synchronous regen at session start (no cache yet)
  • MVP works end-to-end for Claude Code users

PR 2 — Cache & async hooks (~400 LOC)

  • File-based cache with hash-based invalidation
  • Memory write hook → enqueue async regen
  • Wake-up loads cache if present, falls back to static bullets
  • Zero added latency at session start

PR 3 — Multi-provider + TUI (~500 LOC)

  • Codex / Gemini / Ollama providers
  • TUI picker for summarizer config
  • CLI flags --summarizer-*
  • Docs + integrations updates

Non-goals

  • Replace icm_memory_recall — raw recall stays for deep dives
  • Real-time regeneration on session start (must be pre-computed)
  • Cross-project synthesis — scope is per-project briefings

Open questions

  • Invalidation: hash of all memory rows vs explicit dirty flag?
  • Should briefing include "delta since last session"?
  • Token budget: 1500 default, hard cap at 3000?
  • Fallback on LLM unreachable: stale cache vs static bullets vs error?
  • Cost guardrail: rate-limit regen (e.g., max 1/minute per project)?

Acceptance criteria

  • PR 1: Claude CLI provider + auto-detect + config + sync regen
  • PR 2: cache + memory-write hook + async regen
  • PR 3: Codex + Gemini + Ollama providers + TUI + docs
  • Smoke test: full flow with each provider
  • Fallback behavior verified (no LLM, stale cache, missing CLI)
  • Docs cover config + provider setup + opt-out

Milestone

Target: v0.11 (no milestone created yet — to be assigned).

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions