Add inbox-digest skill #180

Draft
gnguralnick wants to merge 128 commits into main from mngr/crystallize-inbox-digest

Conversation

@gnguralnick

Contributor

Crystallizes the Gmail inbox-digest pipeline into a reusable skill at .agents/skills/inbox-digest/.

What it does

Turns an unread Gmail inbox into a flat, structured digest — one record per email, classified by type, with the useful information extracted so the original need not be opened.

  • fetch [script] — list unread Gmail for a configurable query, fetch each full message via latchkey curl, decode the body (text/plain preferred, else HTML stripped to text), and persist raw payloads (messages/<id>.json) + raw.json.
  • digest [ai-script] — one keyless claude -p call per email (claude-haiku-4-5), concurrent, classifying into a category and extracting per-category fields; per-email failures are isolated. Writes digest.json.
  • run-all [script] — chains both and refreshes the stable runtime/inbox-digest/digest.json a web surface reads.
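
A minimal sketch of the body-decoding rule in the fetch step (text/plain preferred, else HTML stripped to text), assuming the standard Gmail API message shape in which each part carries a `mimeType` and URL-safe base64 `body.data`. The function names and the `"plain"` / `"html"` / `"empty"` values for `body_kind` are hypothetical, not taken from the skill's run.py.

```python
import base64
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def _decode_part(part: dict) -> str:
    data = part.get("body", {}).get("data", "")
    # Gmail body data is URL-safe base64, often without padding; restore it.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")


def _walk(part: dict):
    """Yield a part and all nested sub-parts, depth first."""
    yield part
    for child in part.get("parts", []):
        yield from _walk(child)


def decode_body(payload: dict) -> tuple[str, str]:
    """Return (body_kind, text). body_kind values are hypothetical labels."""
    parts = [p for p in _walk(payload) if p.get("body", {}).get("data")]
    for part in parts:
        if part.get("mimeType") == "text/plain":
            return "plain", _decode_part(part)
    for part in parts:
        if part.get("mimeType") == "text/html":
            extractor = _TextExtractor()
            extractor.feed(_decode_part(part))
            extractor.close()
            text = re.sub(r"\s+", " ", "".join(extractor.chunks)).strip()
            return "html", text
    return "empty", ""
```

Searching all parts for text/plain before falling back to HTML keeps the preference independent of part order inside a multipart message.
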

Output schema

Categories: newsletter, github, event, action, receipt, networking, promotion — each with its confirmed per-category fields (incl. Ground News bias spread, Onion satire flag). Every record also carries sender_name, body_kind, raw_body, gmail_url, and raw_payload (path to the preserved full Gmail payload) for preserve-and-surface.
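
As a rough illustration of the schema described above, a consumer could sanity-check a record like this. The category names and the five common field names come from the description; the `category` key name and the `check_record` helper are assumptions, and per-category fields are deliberately not modelled.

```python
# Category names as listed in the PR description.
CATEGORIES = frozenset(
    {"newsletter", "github", "event", "action", "receipt", "networking", "promotion"}
)

# Fields every record is described as carrying.
COMMON_FIELDS = ("sender_name", "body_kind", "raw_body", "gmail_url", "raw_payload")


def check_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks well-formed."""
    problems = []
    # Assumption: the classification is stored under a key named "category".
    if record.get("category") not in CATEGORIES:
        problems.append(f"unknown category: {record.get('category')!r}")
    for field in COMMON_FIELDS:
        if field not in record:
            problems.append(f"missing field: {field}")
    return problems
```
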

Verification

  • Live scenarios (happy-path spread, end-to-end fetch+digest, empty input) and 5/5 fixture-based parser tests pass.
  • Code review applied four robustness fixes; architecture review verdict: ship as-is.

🤖 Generated with Claude Code

Gabriel Guralnick and others added 30 commits May 12, 2026 11:16
Co-authored-by: Sculptor <sculptor@imbue.com>
Collapses mngr create + push + message (and an optional tk ticket) into
a single Python entry point with argparse, a Runner indirection for
unit tests, and pre-flight checks. Replaces the inline bash steps in
launch-task/SKILL.md so every caller (launch-task, crystallize-task,
future skills) shares one driver.

dispatch_test.py covers argv shape, pre-flight failures, ticket
happy/missing/failure paths, and MINDS_WORKSPACE_NAME wiring.

Co-authored-by: Sculptor <sculptor@imbue.com>
Step 4 now calls the generic dispatcher with --ticket-title, so the
ticket is opened in-process (ID written to ticket_id.txt) instead of
needing a separate tk invocation. Step 5 carries the literal poll
one-liner inline. Step 6 points at a new
references/post-crystallize-migration.md walking through consumer
switchovers, runtime-dir cleanup, breaking renames, cached-service
restarts, and ticket close.

Co-authored-by: Sculptor <sculptor@imbue.com>
Problem: in open_ticket, `last_line = ... .splitlines()` bound the
list of lines to a name that reads like a single line, then indexed
`last_line[-1]` to recover the actual last line -- misleading on both
the binding and the index.
Fix: rename to `stdout_lines`. Pure local rename, no behavior change.
Problem: §1's `rg -n "...<slug>" -- '!runtime'` snippets used `--`
followed by `'!runtime'`, which ripgrep parses as a positional path
(literal file/dir named `!runtime`) rather than an exclude glob. Both
snippets exit non-zero with `IO error for operation on !runtime: No
such file or directory`.
Fix: use `-g '!runtime/'`, ripgrep's actual glob-exclusion flag, so
the snippets do what the surrounding prose says they do (search the
tree while skipping the gitignored runtime/ subtree).
Each calling skill (heal-skill, update-skill, crystallize-task) wants its
own ticket title, type, and acceptance criteria. Threading those as
flags through dispatch.py forces the dispatcher to know about a concern
that's purely caller-side. Move the tk lifecycle (create + start at
launch, close on merge) back into the inline bash of each skill.

For crystallize-task specifically, write the ticket ID to
runtime/crystallize/$NAME/ticket_id.txt in Step 2, since the
post-merge-migration step (potentially many turns later) needs to read
it back.

Co-authored-by: Sculptor <sculptor@imbue.com>
Replaces the inline 'mngr create --message-file ... && mngr push ...'
pattern in each skill's launch step with a single dispatch.py call. The
inline pattern was racy: 'mngr create --message-file' delivers the task
to the worker before the runtime dir push lands, so transcript_path /
lead_report_dir resolved against an empty worktree. dispatch.py orders
the lifecycle correctly (create, push runtime dir, message), matching
the rationale already in its module docstring.

This finishes migrating off the racy pattern across all three lead-side
skills that launch crystallize-workers.

Co-authored-by: Sculptor <sculptor@imbue.com>
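
The ordering this commit describes (create, push runtime dir, message) can be sketched as follows. Only the three `mngr` subcommand names and their order are taken from the commit message; the argument layout after each subcommand is a placeholder, and `dispatch` here is a simplified stand-in for the real dispatch.py.

```python
from pathlib import Path
from typing import Callable


def dispatch(
    name: str,
    runtime_dir: Path,
    task_text: str,
    run: Callable[[list], None],
) -> None:
    """Launch a worker so the task only arrives after its runtime dir exists.

    NOTE: the arguments after each subcommand are hypothetical placeholders.
    """
    # 1. Create the worker with no task attached yet.
    run(["mngr", "create", name])
    # 2. Push the runtime dir so task paths resolve in the worker's worktree.
    run(["mngr", "push", name, str(runtime_dir)])
    # 3. Only now deliver the task message.
    run(["mngr", "message", name, task_text])
```

Passing `run` in as a callable lets a test record the calls with `calls.append` and assert on their order without invoking `mngr`.
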
…ispatch-clean

# Conflicts:
#	.agents/skills/launch-task/SKILL.md
Problem: dispatch_test.py used monkeypatch.setattr(dispatch_mod, "Runner",
_Capture) in the two test_main_* cases to substitute the Runner factory.
That is the exact anti-pattern PREVENT_MONKEYPATCH_SETATTR exists to
discourage -- use dependency injection instead. The two tests also
duplicated an inline _Capture class, and test_mngr_failure_is_fatal had
an unused monkeypatch parameter.

Fix: add an optional `runner` parameter to dispatch.py's main() that
threads through to dispatch(). Both test_main_* cases now pass a
_RecordingRunner directly via that seam, dropping the _Capture stubs
and the monkeypatch.setattr lines. Removed the unused monkeypatch
parameter from test_mngr_failure_is_fatal. Factored shared argv
construction into a _main_argv helper.
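
A minimal sketch of the injection seam described in this fix, assuming a `Runner` with a single `run` method. The class bodies, the positional `name` argument, and the argv passed to the runner are illustrative, not the real dispatch.py.

```python
import argparse
import subprocess
from typing import Optional


class Runner:
    """Default runner: actually executes the command."""

    def run(self, argv: list) -> None:
        subprocess.run(argv, check=True)


class RecordingRunner(Runner):
    """Test double: records argv instead of executing anything."""

    def __init__(self) -> None:
        self.calls: list = []

    def run(self, argv: list) -> None:
        self.calls.append(argv)


def main(argv: Optional[list] = None, runner: Optional[Runner] = None) -> int:
    parser = argparse.ArgumentParser(prog="dispatch.py")
    parser.add_argument("name")  # hypothetical positional argument
    args = parser.parse_args(argv)
    # The seam: tests inject a runner; production falls back to the real one.
    active = runner if runner is not None else Runner()
    active.run(["mngr", "create", args.name])  # argv shape is illustrative only
    return 0
```

A test then calls `main(["worker-1"], runner=RecordingRunner())` and inspects `calls`, with no `monkeypatch.setattr` on the module.
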
Problem: section 2 of post-crystallize-migration.md only addressed
deleting the calling-skill's runtime dir
(`runtime/<calling-skill>/<slug>/`). The crystallize flow's own
artifact dir at `runtime/crystallize/<slug>/` (turn.jsonl, task.md,
reports/, ticket_id.txt) was equally stale post-merge but never
mentioned for cleanup, even though section 5 hinted at it.

Fix: in section 2, note that the crystallize runtime dir is also
stale but must be left alone until section 5 reads ticket_id.txt.
Then have section 6 (commit) remove it as part of the migration
commit, so the cleanup ordering is explicit.
Problem: Runner was defined after push(), forcing a quoted forward
reference ("Runner") that was both stylistically awkward and
redundant under `from __future__ import annotations`. Additionally,
push() took `source_dir: str` while the rest of the module used
`Path`, forcing dispatch() to `str()`-wrap each call.

Fix: Move Runner above push(), drop the quoted forward reference,
change push()'s source_dir parameter type to Path, and pass Path
directly from dispatch() (the str() conversion happens once inside
push() before _normalize_dir).
Problem: Step 3 of do-something-new/SKILL.md referenced "Step 6's
crystallize-task handoff", but crystallize-task is invoked in Step 7
(Step 6 is "Deliver remaining surfaces one at a time").
Fix: Update the reference to "Step 7's crystallize-task handoff" so
the cross-reference matches the actual section structure.
Problem: heal-skill/SKILL.md line 46 still described the task-file/turn.jsonl
co-location as relying on "the existing Step 4 `mngr push`" -- but Step 4
no longer runs a raw `mngr push`; it invokes the shared launch-task
dispatch.py (which performs the push internally). The analogous comment
in crystallize-task/SKILL.md was already updated to "the Step 4 push"
in 2dc0e3e; heal-skill was missed in that sweep.

Fix: rephrase to "the Step 4 push" so the inline doc matches the
implementation that follows it. Pure prose edit.
Problem: dispatch_test.py lines 44 and 53 carried `# type: ignore[name-defined]`
and `# type: ignore[override]` comments, suppressing diagnostics that
pyright never raises. The user CLAUDE.md prohibits suppressing
type/lint warnings as a first resort: "Only use suppression when the
code is structurally correct but the tooling can't express it." Here
the tooling expresses nothing -- because dispatch_mod is loaded via
importlib, pyright treats the parent class as Unknown and silently
permits both the subclass and the override -- so the ignore keys are
dead and risk masking real future drift.

Fix: remove both `# type: ignore` trailing comments. Verified with
`uv run pyright` (0 errors), `uv run ruff check` (clean), and the
8-test dispatch_test.py suite (all pass).
…ctly

The crystallize-task / heal-skill / update-skill flows previously slurped a
specific turn out of the lead's Claude Code JSONL via `extract_turn.py`,
pushed `turn.jsonl` over to the worker, and the worker read the slice from
its task-file frontmatter's `transcript_path`.

This was rigid: real lead work often spans multiple turns, not just one,
and the worker had no way to widen the window. Replace it with a
description-plus-anchors handoff:

- Lead writes a prose summary of the work in the task body and includes
  1-3 verbatim quotes ("anchors") from the conversation to ground the
  worker's search.
- Worker uses `mngr transcript $LEAD_AGENT` (with --role user --role
  assistant, then full tool detail once scoped) to locate the relevant
  turns. Each worker SKILL.md notes that the lifecycle-skill invocation
  is the most recent turn and the work is *prior* to it.
- Drop the `transcript_path` field from task-file frontmatter; only
  `lead_agent` and `lead_report_dir` remain.
- Delete `extract_turn.py` + its tests. Trim `transcript_parsing.py`
  down to `iter_transcript` / `is_user_tool_result_carrier` (the only
  helpers the stop hook still imports).

The verify flow keeps its `commit.diff` / `commit.log` artifacts as
ground truth for the diff, but its task body now carries anchor quotes
too in case the rationale alludes to conversational context.

Co-authored-by: Sculptor <sculptor@imbue.com>
Problem: Step 1 in .agents/skills/update-skill/SKILL.md still listed
'incident captured' in the tk-create acceptance string. That clause
referred to the extract_turn.py / turn.jsonl capture step that this
branch removed; the two sibling skills (crystallize-task, heal-skill)
already dropped it.
Fix: Remove the 'incident captured;' prefix so the acceptance string
matches the sibling skills and the actual flow.
…ript flow

Problem: The worker SKILL.md frontmatter description and intro said the
task file "points at a replay transcript on disk", but this branch
removed the on-disk replay artifact: the worker now explores the lead's
transcript via `mngr transcript` (per the same SKILL.md's Stage 1
instructions).
Fix: Reword the description and intro to describe the new mechanism
(verbatim quote anchors plus mngr transcript), so the orientation prose
matches the implementation immediately below it.
Problem: The worker SKILL.md frontmatter description said the task
file asks the worker to heal "by pointing at its incident transcript",
but this branch removed the on-disk incident transcript: the worker
now reads the lead's transcript via `mngr transcript` (per Stage 1).
Fix: Reword the description to describe the new mechanism (incident
description + verbatim quote anchors), so the orientation matches
the implementation.
Problem: The absorb-flow bullet in update-skill SKILL.md said the lead
"hand[s] the worker the incident transcript", but this branch removed
the on-disk transcript handoff: the lead now hands an incident summary
with verbatim quote anchors, and the worker reads the lead's transcript
via `mngr transcript` (per references/lead-absorb.md and
references/worker-absorb.md).
Fix: Reword the bullet to describe the new mechanism so the high-level
overview matches the per-flow references.
Problem: parse_task_frontmatter.py:92 in `parse()` says "Return the
three required fields", but transcript_path was dropped earlier on
this branch and only two fields (lead_agent, lead_report_dir) remain.

Fix: change "three" to "two" so the function-level docstring matches
the module-level docstring (already says "two shell-evalable lines")
and the actual _REQUIRED_FIELDS tuple.
So the worker's first `mngr transcript <lead>` read includes every turn
through the handoff moment. The converter normally polls on a 5s
interval, racing with worker startup.

No-op when MNGR_AGENT_STATE_DIR is unset or the converter script isn't
installed at the standard path (non-claude agents).
The required fields still validate strictly, but additional top-level
string keys now flow through to the worker as KEY=value lines. Leads
can attach flow-specific context (ticket id, feature flag, inputs
manifest) without each new key needing a parser change.

Non-string values (lists, mappings, numbers, bools) are still dropped
since only strings round-trip cleanly through `eval`.
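
The pass-through rule can be sketched roughly as below. The two required field names and the uppercase KEY=value output follow the commit messages; the `render` function, its error text, and the use of `shlex.quote` for quoting are assumptions about how such a parser might look.

```python
import shlex

REQUIRED_FIELDS = ("lead_agent", "lead_report_dir")


def render(frontmatter: dict) -> list:
    """Render string-valued frontmatter keys as shell-evalable KEY=value lines."""
    missing = [k for k in REQUIRED_FIELDS if not isinstance(frontmatter.get(k), str)]
    if missing:
        raise ValueError(f"missing required frontmatter fields: {', '.join(missing)}")
    lines = []
    for key, value in frontmatter.items():
        if not isinstance(value, str):
            # Lists, mappings, numbers, and bools are dropped, not rendered.
            continue
        # Assumption: quoting via shlex.quote so values survive `eval` intact.
        lines.append(f"{key.upper()}={shlex.quote(value)}")
    return lines
```
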
- Drop the flow-by-flow enumeration of what's staged alongside task.md;
  each worker's own SKILL.md is authoritative for its flow's inputs.
- Drop the verbose output-shape explanation; the eval example shows it.
- Drop the redundant frontmatter-schema block; parse_task_frontmatter.py
  docstring is canonical.
- Note that extra string fields the lead sets pass through as shell
  vars (matches the parser change).
Problem: parse_task_frontmatter.py at .agents/shared/scripts/ now passes
through arbitrary string frontmatter keys. _render() emits each as
KEY=value for downstream `eval`, but never validates the key is a valid
POSIX shell identifier. A key like `staged-inputs` (plausible given the
docstring invites "a list of staged inputs") would render as
`STAGED-INPUTS=value`, which bash treats as a command lookup -- the
variable is silently never set and the worker proceeds without an
expected field.

Fix: in parse(), check every extra key against
^[A-Za-z_][A-Za-z0-9_]*$ and raise ValueError with a clear message
naming the offending key and suggesting snake_case. Required fields
(lead_agent, lead_report_dir) conform by construction so the happy
path is unchanged. Added two regression tests (dash and leading-digit
cases) and updated the module docstring to call out the constraint.
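
A small sketch of the key check this fix describes, using the regex quoted above. The helper name and exact error wording are hypothetical.

```python
import re

# Same pattern as quoted in the commit message: a POSIX shell identifier.
_SHELL_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def validate_extra_keys(keys) -> None:
    """Raise ValueError for any key that cannot become a shell variable name."""
    for key in keys:
        # fullmatch avoids `$` accepting a trailing newline.
        if not _SHELL_IDENTIFIER.fullmatch(key):
            raise ValueError(
                f"frontmatter key {key!r} is not a valid shell identifier; "
                "use snake_case (letters, digits, underscores; no leading digit)"
            )
```

Rejecting the key at parse time turns a silently unset variable in the worker into an immediate, named error on the lead side.
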
Problem: _flush_common_transcript() in
.agents/skills/launch-task/scripts/dispatch.py invoked the lead's
common_transcript.sh converter with check=True. Any non-zero exit
(the script uses `set -euo pipefail`, and the converter touches log
files / runs subprocess pipelines that can fail transiently) would
abort dispatch *after* `mngr create` and the runtime push had already
succeeded, but *before* `mngr message`. That left the worker orphaned
in a half-launched state requiring manual cleanup.

Fix: switch to check=False, inspect the returncode, and on non-zero
exit print a stderr warning explaining the worker will read whatever
the periodic 5s poller has already produced. The flush is documented
in the module docstring as a freshness optimization that merely races
the poller, so the stale-by-a-few-seconds fallback is correct; losing
the entire dispatch is not. Added a regression test that simulates a
converter exit of 2 and asserts dispatch still completes with the
message send running and a warning landing on stderr.
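
The non-fatal flush can be sketched like this, assuming the converter is invoked as an ordinary subprocess. The function name, return value, and warning text are illustrative; only the `check=False` plus returncode inspection is taken from the fix.

```python
import subprocess
import sys


def flush_transcript(converter_argv: list) -> bool:
    """Run the converter once; warn instead of raising if it fails."""
    # check=False: a failing converter must not abort the rest of dispatch.
    result = subprocess.run(converter_argv, check=False)
    if result.returncode != 0:
        print(
            f"warning: transcript flush exited {result.returncode}; "
            "the worker will read whatever the periodic poller already produced",
            file=sys.stderr,
        )
        return False
    return True
```
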
The Stop-hook crystallization detector now reads
$MNGR_AGENT_STATE_DIR/events/claude/common_transcript/events.jsonl
(the agent-agnostic event log mngr_claude maintains) instead of the
raw Claude transcript path the hook payload provides.

Freshness: invoke $MNGR_AGENT_STATE_DIR/commands/common_transcript.sh
--single-pass synchronously at hook entry so events.jsonl catches up
with the just-finished turn before we read. Without this the 5s poller
races worker startup.

Format shifts handled:
- assistant content[].tool_use blocks -> assistant_message.tool_calls[]
  with tool_call_id / tool_name / input_preview (JSON-encoded, truncated
  to 200 chars).
- user content[].tool_result blocks -> standalone tool_result events.
- isMeta user events -> tool_result with tool_name="meta"; boundary
  detection now treats both user_message and tool_result(meta) as turn
  boundaries.
- Bash command detection parses the input_preview JSON and checks the
  command field exactly (avoiding false positives from a "git commit"
  mention in the description). Falls back to substring matching when
  the preview is truncated past JSON-parseability.
- Skill input.skill is read from input_preview JSON; short Skill calls
  always fit in 200 chars.
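
The command-detection rule in the list above could look roughly like this. The `tool_name` and `input_preview` field names come from the commit message; the function name, the `"Bash"` tool-name literal, and the `"git commit"` needle are assumptions made for the example.

```python
import json


def is_git_commit_call(tool_call: dict) -> bool:
    """Detect a Bash tool call whose command (not its description) runs git commit."""
    if tool_call.get("tool_name") != "Bash":
        return False
    preview = tool_call.get("input_preview", "")
    try:
        parsed = json.loads(preview)
    except json.JSONDecodeError:
        # Preview was truncated past JSON-parseability: fall back to substring.
        return "git commit" in preview
    command = parsed.get("command", "") if isinstance(parsed, dict) else ""
    return "git commit" in command
```

Reading only the `command` field means a description such as "check before git commit" no longer counts as a commit.
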

Nudge state file now keys by transcript_id (the agent's state_dir
path) instead of the raw transcript path. Old state files become
inert -- they re-arm one extra nudge after upgrade and then stabilize.

transcript_parsing.py had only two helpers (iter_transcript and
is_user_tool_result_carrier), both raw-Claude-format specific and
both only consumed by this hook. Deleted along with its test file.
Gabriel Guralnick and others added 30 commits June 9, 2026 17:54
… gabriel/ai-driven-updates

# Conflicts:
#	.agents/skills/build-web-service/SKILL.md
Problem: In AgentManager.update_session_events, the comment above the
_last_event_timestamp_by_agent assignment claimed the refresh was "kept out
of the short-circuit above" -- the opposite of reality. The assignment sits
after the early-return guard, so it is subject to that short-circuit and is
skipped when pending/type are unchanged.
Fix: Rewrote the comment to accurately state that the refresh shares the same
short-circuit (no-op streamed events return early, skipping the refresh and
the recompute). Comment-only change; no behavior change.

Co-authored-by: Sculptor <sculptor@imbue.com>
…nical

main superseded this branch's AI-driven-services approach: where the branch
built a libs/ai_integration workspace library (backends/core/credentials/
pricing/spend) behind the use-ai-integration skill, main rebuilt the skill
around a copyable, self-contained claude_p.py helper (keyless claude -p) plus
a keyed litellm path, with no shared library.

Treat main's version as canonical:
- Take main's use-ai-integration SKILL.md, billing-and-credentialing.md, and
  the claude_p.py helper + test; drop the branch's patterns.md.
- Remove libs/ai_integration entirely (no production consumer) and its build
  wiring: the ai-integration workspace member/dependency/source in
  pyproject.toml and uv.lock; keep main's litellm dependency.
- Revert the edit-services AI-spend-ceiling section (library-specific).
- Drop the obsolete blueprint/use-ai-integration v1 plan; main's
  blueprint/ai-driven-services plan is canonical.
- Retarget the skill-authoring methodology (crystallize-task, update-skill,
  heal-skill, build-web-service, spec-summary, when-to-crystallize,
  update-vs-create-new) from "script model-judgement steps via the
  ai_integration library + a # workspace-script: marker" to "copy claude_p.py
  into the skill's scripts/ and call it from a self-contained PEP 723 run.py";
  revert validate_skill.py + test to the PEP-723-only check.

Worker-launch script (create_worker.py) and agent_manager.py converged on both
sides; took main's create_worker.py/test and kept the branch's comment fix.
…py per the skill

Scripting a model-judgement step should not default to the keyless claude_p.py
helper. A non-agentic step with an API key set should call litellm directly
(cheaper), and the cost / keyed-savings should be surfaced to the user, exactly
as the use-ai-integration skill prescribes.

- Add .agents/shared/references/calling-claude.md as the single source for
  one-shot model-call path selection (keyed non-agentic -> litellm; keyless ->
  claude_p_completion; agentic -> claude_p_task) plus cost surfacing
  (cost_usd / usage, completion_cost, the cost_per_token keyed-savings estimate,
  measure-a-small-sample).
- Trim use-ai-integration/SKILL.md to defer scenarios 1-2 and the cost model to
  calling-claude.md, keeping its scenario framing and the full-agent (launch-task)
  scenario. Removes the duplicated inline mechanics.
- Point spec-summary's "Scripting a model step" and the lifecycle skills
  (crystallize-task, update-skill, heal-skill, when-to-crystallize,
  update-vs-create-new) at calling-claude.md, and make their inline mentions
  path-agnostic ([ai-script] / "model call") instead of naming claude_p.py.

Progressive disclosure: each consumer loads only what it needs and the path /
cost guidance lives in exactly one place.
…tion skill directly

Reverts the calling-claude.md extraction. The extracted reference did not earn
its keep: the only consumers (the skill-authoring docs and build-web-service)
can just read the use-ai-integration skill, which is already the authority on
calling Claude. Extracting it had made that skill non-self-contained (it
deferred its own core content to an external file) for a marginal
progressive-disclosure gain.

- Remove .agents/shared/references/calling-claude.md.
- Restore use-ai-integration/SKILL.md to its self-contained form (scenarios 1-2
  and the cost model inlined again), but generalize the framing: it is no longer
  written specifically for "a service" -- it now reads as the shared reference
  for calling Claude from code (service, skill [ai-script] step, or any
  integration), with the referencing context supplying the framing.
- Point spec-summary's "Scripting a model step" and the crystallize-task worker
  at the use-ai-integration skill instead of the removed file.

The litellm-first-when-keyed guidance and cost surfacing are preserved -- they
live in the (now self-contained) use-ai-integration skill.
The between-turns auto-merge of origin/main bumped the vendored mngr libs
(imbue-common 0.1.19->0.1.20, imbue-mngr/imbue-mngr-claude 0.2.12->0.2.14),
which now pull anthropic and docstring-parser as transitive deps. No
intentional first-party dependency change.
…se] vs [ai-script]

Sharpen the canonical step-kind reference (spec-summary.md) with an
operational test for when a model step stays [ai-script] vs drops to
[prose], the concrete categories that justify prose, and the
push-prose-to-the-edges rule. Have the crystallize and update-skill
worker prose-justification bullets reference the same test.

Pending user confirmation on category granularity and whether to add a
validate_skill.py enforcement check.
…not transcript access

The earlier framing claimed a step must be [prose] when it 'needs the live
conversation.' That's wrong: a script can fetch the transcript (or the
relevant slice) and pass it into the prompt -- the crystallize/update workers
already run that way. Needing the conversation, like needing judgement, does
not justify prose.

Reframe around the skill's execution mode: a step is [prose] only when the
skill needs the user in the loop while it runs -- for live input/approval, or
because they invoke it interactively to follow along and steer. Otherwise
every step is scriptable and there is no reason to run it interactively.
Update spec-summary.md, the two worker prose-justification bullets, and the
crystallize re-run-test doc accordingly.
…e test

Make spec-summary.md the single canonical statement of the [ai-script] vs
[prose] / execution-mode test, and have the dependent docs defer to it
instead of re-deriving the same test. Net: principles stated once, no
content dropped.
…n.py shape

Decompose run.py into a subcommand per cleanly-separable step plus a 'run
all' that chains them in-process. One scripted implementation then serves
both execution modes: an agent running the skill in chat drives the steps
one at a time (mirrored as tk steps) for a rich progress view, while
headless/scheduled runs call 'run all'. Steps that hand a live handle to the
next stay inlined. The per-step split also leaves inspectable intermediate
artifacts at each boundary.
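
A minimal sketch of that run.py shape, with one subcommand per step plus a chained entry point. The step names `fetch` and `digest` and the `run-all` spelling are borrowed from the inbox-digest description; the step bodies are placeholders, and returning a log from `main` is purely for illustration.

```python
import argparse
from typing import Optional


def step_fetch(log: list) -> None:
    log.append("fetch")  # placeholder for the real fetch step


def step_digest(log: list) -> None:
    log.append("digest")  # placeholder for the real digest step


# Ordered mapping: run-all executes the steps in this order, in-process.
STEPS = {"fetch": step_fetch, "digest": step_digest}


def main(argv: Optional[list] = None) -> list:
    parser = argparse.ArgumentParser(prog="run.py")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in STEPS:
        sub.add_parser(name)
    sub.add_parser("run-all")
    args = parser.parse_args(argv)

    log: list = []
    names = list(STEPS) if args.command == "run-all" else [args.command]
    for name in names:
        STEPS[name](log)
    return log
```

An agent in chat would invoke `run.py fetch` and then `run.py digest` as separate steps, while a scheduled job calls `run.py run-all`.
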

Flip the prior 'single entry point, inline unless an invariant demands
splitting' default across the lifecycle skills, and add a runtime note in
CLAUDE.md (not per-SKILL.md) telling agents to drive subcommands step by
step in chat, pausing only at declared [prose] steps.
Co-authored-by: Sculptor <sculptor@imbue.com>
Skill validation now does a runnability pass: when the static checks pass
and a scripts/run.py exists, it runs `uv run scripts/run.py --help`, which
forces uv to resolve the script's PEP 723 dependencies and import the module.
A broken import or unresolvable dependency now fails validation instead of
only surfacing at scenario time. This is a shallow import check (--help only
exercises top-level imports and argparse wiring); deeper paths remain covered
by scenario testing.

The subprocess runner is injectable so the result-handling is unit-testable
without a real uv run; added tests cover clean scripts, broken imports,
non-zero exits, timeouts, and a missing uv. Updated the spec-summary and the
crystallize-task/update-skill worker instructions to describe the new check.
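
The runnability pass with an injectable runner might be sketched as follows. The `uv run scripts/run.py --help` command and the 180s allowance appear in the commit messages; the function name, return convention (None on success, a message on failure), and message wording are assumptions.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional


def check_runnable(
    skill_dir: Path,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout: float = 180.0,
) -> Optional[str]:
    """Return None if the skill's run.py imports cleanly, else a failure message."""
    script = skill_dir / "scripts" / "run.py"
    if not script.exists():
        return None  # nothing to check
    argv = ["uv", "run", str(script), "--help"]
    try:
        result = run(argv, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return "uv not found on PATH"
    except subprocess.TimeoutExpired:
        return f"timed out after {timeout:.0f}s"
    if result.returncode != 0:
        return f"exit {result.returncode}: {result.stderr.strip()}"
    return None
```

Because `run` is a parameter, unit tests can substitute a fake that returns a canned `CompletedProcess` or raises, so no real `uv run` is needed.
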
Problem: test_check_runnable_good_script and test_check_runnable_broken_import
in .agents/shared/scripts/validate_skill_test.py invoke a real
`uv run scripts/run.py --help`, which resolves a PEP 723 environment in a
subprocess. The root pytest config applies a global 10s per-test timeout, but a
cold uv cache (e.g. a fresh CI runner) can take far longer to resolve -- the
validator itself allows 180s -- so these two tests could time out and flake in
CI.
Fix: mark both real-uv tests with @pytest.mark.timeout(180) via a shared
named constant, matching the validator's own cold-cache allowance, so they get
a realistic per-test budget while the rest of the suite keeps the fast 10s
timeout.
Crystallizes the Gmail inbox-digest pipeline: fetch unread mail (query
parameterized), classify each email by type via a scripted AI call, and
extract per-category fields into a flat structured digest matching the
confirmed schema. Preserves raw payloads + gmail_url and refreshes a stable
runtime/inbox-digest/digest.json for the web surface.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem: _classify_one let any per-email exception (a claude -p error, or
output that could not be parsed/classified) propagate out of the anyio task
group, which cancelled every sibling task and aborted the whole digest/run-all
on a single malformed email. This also made the surrounding partial-success
machinery unreachable -- the pre-sized results list, the finished filter, and
the 'Digested M/N emails' line only make sense if individual emails can be
skipped while the rest succeed.

Fix: catch the pipeline's expected failure types (ClaudeCLIError,
InboxDigestError) inside _classify_one, leaving that email's result slot None
so the digest is built from the emails that did succeed, and report the skipped
email's id/subject and reason to stderr instead of dropping it silently. The
except is scoped to the known failure types so a genuine programming error
still surfaces.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
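
The per-email isolation can be illustrated with a small sketch. It uses the standard library's asyncio rather than the anyio task group the commit mentions, and `classify` is a hypothetical stand-in for the model call; only the pre-sized results list, the narrow `except`, and the "Digested M/N emails" line mirror the commit message.

```python
import asyncio
import sys


class InboxDigestError(Exception):
    """Expected, per-email pipeline failure."""


async def classify(email: dict) -> dict:
    # Hypothetical stand-in for the scripted model call.
    if not email.get("body"):
        raise InboxDigestError("empty body")
    return {"id": email["id"], "category": "newsletter"}


async def digest_all(emails: list, limit: int = 4) -> list:
    results = [None] * len(emails)  # pre-sized: one slot per email
    semaphore = asyncio.Semaphore(limit)

    async def classify_one(index: int, email: dict) -> None:
        async with semaphore:
            try:
                results[index] = await classify(email)
            except InboxDigestError as exc:
                # Known failure: leave the slot None and report, do not abort.
                print(f"skipped {email.get('id')}: {exc}", file=sys.stderr)

    # Any other exception type still propagates, so real bugs surface.
    await asyncio.gather(*(classify_one(i, e) for i, e in enumerate(emails)))
    finished = [r for r in results if r is not None]
    print(f"Digested {len(finished)}/{len(emails)} emails", file=sys.stderr)
    return finished
```
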
Problem: _load_records in scripts/run.py is annotated to return
list[dict[str, object]] but only checked that the top level was a list, not
that each element was a dict. A raw.json whose elements were non-objects
would pass and later crash with AttributeError in the digest step instead of
the module's intended clean InboxDigestError.

Fix: validate that every list element is a dict, raising InboxDigestError
naming the offending index, so the runtime value matches the declared return
type and malformed input is reported cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem: _load_records in scripts/run.py read raw.json with bare
json.loads(path.read_text()), so a missing file (FileNotFoundError) or
malformed JSON (json.JSONDecodeError) raised exceptions that are not
InboxDigestError. main() only catches InboxDigestError, so the digest and
run-all subcommands dumped a raw traceback instead of the intended
'error: ...' message and exit code 1.

Fix: wrap the read+parse, converting OSError and ValueError into
InboxDigestError with a message naming the path, matching how the rest of
the module funnels expected failures. Now both cases print a clean error
and exit 1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
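
Taken together, the two `_load_records` fixes amount to something like the sketch below. The exception name and the checks follow the commit messages; the exact error wording and the public function name `load_records` are assumptions.

```python
import json
from pathlib import Path


class InboxDigestError(Exception):
    """Expected failure that main() reports as 'error: ...' with exit code 1."""


def load_records(path: Path) -> list:
    """Read raw.json, funnelling every expected failure into InboxDigestError."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # OSError covers a missing file; ValueError covers json.JSONDecodeError
        # and UnicodeDecodeError, both of which subclass it.
        raise InboxDigestError(f"cannot read {path}: {exc}") from exc
    if not isinstance(data, list):
        raise InboxDigestError(f"{path}: expected a JSON list at the top level")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise InboxDigestError(f"{path}: element {index} is not a JSON object")
    return data
```
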
Problem: run.py serialized JSON with ensure_ascii=False (keeping raw non-ASCII
characters that are common in real email) but used Path.write_text/read_text
without an encoding, so the locale's preferred encoding was used. Under a
non-UTF-8 locale with Python's UTF-8 mode disabled (e.g. cron/headless jobs,
where run-all is the documented entrypoint) this raises UnicodeEncodeError on
write and UnicodeDecodeError on read.
Fix: pass encoding="utf-8" to all five write_text/read_text calls (raw
payloads, raw.json, digest.json, the stable copy, and the raw.json read),
matching the encoding implied by ensure_ascii=False and the codebase convention
in other skill scripts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add body_kind and a raw_payload path to every digest record so a consumer can
reach the original Gmail payload (and render HTML emails in native format)
rather than relying only on the HTML-stripped raw_body and the gmail_url link.
Strengthens the preserve-and-surface contract per the architecture review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The scaffolded services.toml command set ROOT_PATH before
`python3 scripts/forward_port.py`, but a `VAR=val cmd1 && cmd2` prefix
binds VAR to cmd1 only -- so the app (`uv run <name>`) ran with ROOT_PATH
unset and FastAPI's root_path was empty. Move the assignment onto the app.

Also document the client-side half: every URL the app's HTML/JS emits
must be relative, never an absolute path (which escapes to the workspace
shell) or a hardcoded prefix (which breaks standalone / on rename).
Correct the previously-false gotchas claim that the scaffolded happy path
emits prefix-correct URLs without further work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
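
The shell behaviour behind the first fix can be reproduced in isolation. This sketch assumes a POSIX `sh` is available and uses a made-up variable name, `DEMO_ROOT`, with `true` standing in for the first command; it is not the scaffolded services.toml command itself.

```python
import os
import subprocess


def run_sh(script: str) -> str:
    """Run a script under sh with a minimal environment and return its stdout."""
    env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    done = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return done.stdout.strip()


# The prefix assignment applies to `true` only; the second command never sees it.
broken = run_sh('DEMO_ROOT=/service/x true && echo "${DEMO_ROOT:-unset}"')

# Moving the assignment onto the second command puts it in that command's environment.
fixed = run_sh("true && DEMO_ROOT=/service/x sh -c 'echo \"${DEMO_ROOT:-unset}\"'")

print(broken)  # unset
print(fixed)   # /service/x
```
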
Problem: the new "Client-side URLs" section in cross-flow-gotchas.md
asserts WebSocket URLs must be relative and never a hardcoded prefix, and
cross-references the WebSocket section as "one instance of this same
rule" -- but that section's example hardcoded the prefix
(`/service/<name>/socket`), contradicting the rule it was cited as
illustrating.

Fix: lead the WebSocket section with the relative form
(`new WebSocket("socket")`), which modern browsers resolve against the
injected `<base href>` with automatic ws/wss scheme upgrade, matching the
new rule. Keep the absolute-from-`location` construction as an explicit
older-browser fallback rather than the default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tighten the Step 2 URL rule, the gotchas client-side/server-side sections,
and the generated runner docstring; remove the unsupported "single most
common prefix bug" assertion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A FastAPI service at /service/inbox-digest/ that renders the unread
inbox grouped by type (newsletters -> articles, GitHub, actions, events,
receipts, networking, promotions collapsed), with a per-item 'view raw'
(original email rendered in a sandboxed iframe, scripts/remote pixels
blocked) and 'open in Gmail'. Seeded from the confirmed sample; reads
runtime/inbox-digest/digest.json.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>