feat: LLM-summarized wake-up briefing (auto-detect invoker, cheap defaults, configurable)

## Problem

Current wake-up = static top-N memory bullets (~440 tokens). For projects with rich memory history (decisions, resolved errors, blockers, licensing stages, smoke test states), this is too shallow — agents have to fire `icm_memory_recall` to get useful context, adding latency and turn cost.

When the user asks "where are we?" / "what's missing?", the static bullets give a surface view; the real signal lives in the broader memory store.

## Proposal

Replace (or augment) the static wake-up with an LLM-generated narrative briefing, compiled out-of-band whenever project memories change.

### Flow

1. Memory write → enqueue summarization job for the affected project
2. Job reads all memories (project-scoped + relevant globals)
3. LLM produces a structured briefing:
   - State of work (in flight, blocked)
   - Recent decisions and their rationale
   - Known errors / resolutions to remember
   - Project-specific user preferences
4. Briefing cached at `~/.icm/cache/wake-up-{project}.md`
5. Session start loads cached briefing — zero added latency

## Auto-detect the invoker AI tool

ICM should reuse the CLI already authenticated on the user's machine. No extra API key to manage.

**Detection priority:** | Prio | Mechanism | Notes |
|---|---|---|
| 1 | `ICM_INVOKER` env var | Explicit override (settable by hooks) |
| 2 | Tool-specific env vars | `CLAUDECODE`, `CURSOR_TRACE_ID`, `GEMINI_CLI`, `CODEX_HOME`, etc. |
| 3 | MCP `clientInfo.name` | When ICM is invoked via MCP |
| 4 | Parent process walk | Fallback (`/proc/$PPID/comm` on Linux) |
| 5 | Config TOML default | `fallback_provider = "claude"` |

**Provider mapping (cheap defaults):**

| Invoker | Command | Default model |
|---|---|---|
| Claude Code | `claude -p --model <m>` | `claude-haiku-4-5` |
| Codex CLI | `codex exec --model <m>` | cheap codex equivalent (TBD) |
| Gemini CLI | `gemini -p --model <m>` | `gemini-2.5-flash` |
| Ollama | HTTP API | `qwen2.5:7b` |
| API direct | Anthropic/Google/OpenAI SDK | provider-specific cheap default |

Why cheap defaults: summarization is a simple task. No need for Opus/Sonnet tier. User pays nothing extra in most cases (their own CLI quota).

## Configuration (3 levels)

`~/.config/icm/config.toml`:

```toml
[wakeup.summarizer]
provider = "auto"              # auto | claude | codex | gemini | ollama | api
model = ""                     # empty = provider default
fallback_provider = "claude"
max_tokens = 1500
cache_dir = "~/.icm/cache"
regen_on_memory_write = true
```

CLI override: `icm wakeup --summarizer-provider ollama --summarizer-model qwen2.5:7b`
TUI: provider/model picker in `icm` interactive menu.
Per-project: `.icm/config.toml` in repo root.

## Suggested PR breakdown (3 sub-PRs targeting v0.11)

**PR 1 — Foundation (~500 LOC)**
- `Provider` trait
- Claude CLI provider (only)
- Auto-detect invoker (env vars + fallback, no MCP/pproc yet)
- Config schema extension
- Synchronous regen at session start (no cache yet)
- MVP works end-to-end for Claude Code users

**PR 2 — Cache & async hooks (~400 LOC)**
- File-based cache with hash-based invalidation
- Memory write hook → enqueue async regen
- Wake-up loads cache if present, falls back to static bullets
- Zero added latency at session start

**PR 3 — Multi-provider + TUI (~500 LOC)**
- Codex / Gemini / Ollama providers
- TUI picker for summarizer config
- CLI flags `--summarizer-*`
- Docs + integrations updates

## Non-goals

- Replace `icm_memory_recall` — raw recall stays for deep dives
- Real-time regeneration on session start (must be pre-computed)
- Cross-project synthesis — scope is per-project briefings

## Open questions

- Invalidation: hash of all memory rows vs explicit dirty flag?
- Should briefing include "delta since last session"?
- Token budget: 1500 default, hard cap at 3000?
- Fallback on LLM unreachable: stale cache vs static bullets vs error?
- Cost guardrail: rate-limit regen (e.g., max 1/minute per project)?

## Acceptance criteria

- [ ] PR 1: Claude CLI provider + auto-detect + config + sync regen
- [ ] PR 2: cache + memory-write hook + async regen
- [ ] PR 3: Codex + Gemini + Ollama providers + TUI + docs
- [ ] Smoke test: full flow with each provider
- [ ] Fallback behavior verified (no LLM, stale cache, missing CLI)
- [ ] Docs cover config + provider setup + opt-out

## Milestone

Target: **v0.11** (no milestone created yet — to be assigned).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: LLM-summarized wake-up briefing (auto-detect invoker, cheap defaults, configurable) #165

Problem

Proposal

Flow

Auto-detect the invoker AI tool

Configuration (3 levels)

Suggested PR breakdown (3 sub-PRs targeting v0.11)

Non-goals

Open questions

Acceptance criteria

Milestone

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Invoker	Command	Default model
Claude Code	`claude -p --model <m>`	`claude-haiku-4-5`
Codex CLI	`codex exec --model <m>`	cheap codex equivalent (TBD)
Gemini CLI	`gemini -p --model <m>`	`gemini-2.5-flash`
Ollama	HTTP API	`qwen2.5:7b`
API direct	Anthropic/Google/OpenAI SDK	provider-specific cheap default

feat: LLM-summarized wake-up briefing (auto-detect invoker, cheap defaults, configurable) #165

Description

Problem

Proposal

Flow

Auto-detect the invoker AI tool

Configuration (3 levels)

Suggested PR breakdown (3 sub-PRs targeting v0.11)

Non-goals

Open questions

Acceptance criteria

Milestone

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions