feat(run): image attachment passthrough with provenance#1811
Conversation
Adds the operator-facing --attach surface and the spawn-time
provenance contract: each attached image is anchored to the run via
an HMAC-chained audit event, the content-addressed blob store, and
the worker's lineage v1 receipt parents.
Changes:
- Add Task.attachments list field and plumb it through Task.from_dict.
- New module src/bernstein/core/agents/multimodal_attestation.py:
build_attachment_context() reads paths, stores bytes in CAS,
records the audit event, and returns a MultiModalContext.
- New module src/bernstein/core/security/audit_chain.py: AuditChainStore
facade over AuditLog plus the additive multimodal.attach event type
and the record_multimodal_attach() helper.
- Additive helpers in core/persistence/lineage_signer.py for the
attachment-as-parent URI scheme.
- CLI: --attach option on bernstein run, repeatable, validated path,
with capability gating before any process is launched.
- Adapters: Claude and Gemini accept multimodal_context= and inline
base64-encoded attachments with the documented wire format.
- YAML plan loader honours an attachments: list on each step.
- Worktree pinning enforced at resolve time; cross-worktree attempts
raise WorktreeAccessDenied.
- Documentation in docs/operations/run.md.
- Tests:
- tests/unit/test_multimodal_attestation.py: 23 cases covering
Task model field, capability gating, sha256 stability, audit
record shape, lineage parents, worktree isolation, replay,
tamper detection, chain continuity, and YAML plan loader.
- tests/integration/test_run_attach.py: end-to-end stub-adapter
round-trip plus CLI option validation.
Closes #1797
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 WalkthroughWalkthroughThis PR implements operator-driven multimodal image attachment passthrough with content-addressed storage, HMAC-chained audit recording, and worktree-pinned access control. Operators invoke ChangesMultimodal Attachment Passthrough with Provenance
🎯 4 (Complex) | ⏱️ ~60 minutes Possibly Related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Sorry @chernistry, you have reached your weekly rate limit of 2500000 diff characters.
Please try again later or upgrade to continue using Sourcery
Sonar insights (advisory, no merge-block)Snapshot of
Run This comment is a soft signal. The Sonar scan runs on push to |
Review-bot acknowledgement summary
All must-address findings are resolved or acknowledged. |
|
bernstein doctor observe for PR #1811 ( sonar -- WARN (project bernstein)
code-scanning -- WARN (38 open alert(s))
Skipped backends (credentials not configured)
See docs/observability/unified-doctor.md for backend setup notes. |
Auto-applied by contract-drift-autofix.yml on PR #1811. Regenerated via scripts/regen_contract_drift.py. Refs #1273. Source CI run: https://github.com/sipyourdrink-ltd/bernstein/actions/runs/26251274378
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/bernstein/adapters/gemini.py (1)
214-243:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDo not route attachment blobs through argv.
In
src/bernstein/adapters/gemini.pyLines 221-243, attachment base64 is injected intopromptand then passed asbinary -p <prompt>. That makes multimodal runs vulnerable to OS command-line size limits, so a handful of medium images can fail before the CLI even launches.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/bernstein/adapters/gemini.py` around lines 214 - 243, The current implementation injects base64 attachment blobs into prompt via _inject_multimodal_attachments and then passes that huge string in cmd (binary -p prompt), which risks OS argv size limits; instead, remove embedding binary data into prompt in the method that builds cmd (stop calling _inject_multimodal_attachments into prompt) and persist multimodal_context to a temporary file under the existing workdir (e.g., log_path.parent) or stream it via stdin, then modify the command built in resolve_google_cli_binary usage to pass the attachment file/stream reference (for example a --attachments-file or use stdin) rather than inline base64; update any callers that expect injected text and ensure the temp file is securely created and cleaned up after SpawnResult completes.src/bernstein/adapters/claude.py (1)
380-381:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDo not send attachment payloads through
-p.In
src/bernstein/adapters/claude.pyLines 564-570, base64 attachments are prepended toprompt, and Lines 380-381 still pass the full blob as one argv argument. A few screenshots is enough to hit OS command-line size limits and fail spawn withE2BIGbefore Claude starts.Also applies to: 549-570
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/bernstein/adapters/claude.py` around lines 380 - 381, The code currently prepends base64 attachment blobs into the prompt and then calls cmd.extend(["-p", prompt]) (in src/bernstein/adapters/claude.py, around the prompt/attachment handling and the cmd construction), which can hit OS argv size limits; change the flow so attachments are not passed inline in the argv: for each attachment detected in the prompt-building code (the block that prepends base64 around lines 549-570) write the decoded payload to a temporary file (e.g., tempfile.NamedTemporaryFile) and remove the blob from the in-memory prompt, then update the CLI invocation built with cmd (the place using cmd.extend(["-p", prompt])) to instead pass a small prompt string and pass attachment file paths via a separate flag or a stable convention (e.g., "-a", file_path for each temp file) or feed the prompt via stdin (use '-' and pipe the prompt) so you never pass large base64 data as a single argv entry.docs/operations/run.md (1)
1-101:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd required operational sections for the new
--attachworkflow.Lines 11-92 document behavior but omit explicit environment-variable references, health-check steps, and troubleshooting procedures for operators.
Suggested structure
+## Environment variables +- `BERNSTEIN_RUN_ATTACHMENTS` (internal propagation for spawned worker context) + +## Health checks +```sh +bernstein run --help +bernstein run --goal "probe" --attach ./screenshot.png --cli claude --dry-run +``` + +## Troubleshooting +- Capability refusal when adapter is not multimodal-capable. +- Cross-worktree resolution denial (`WorktreeAccessDenied`) and recovery steps. +- Audit-chain tamper verification failure handling steps.As per coding guidelines: "docs/operations/**/*.md: In deployment and operational documentation, include complete configuration examples, environment variable references, health check procedures, and troubleshooting guides for each deployment scenario".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/operations/run.md` around lines 1 - 101, The doc is missing operational details for the new --attach workflow; add explicit environment-variable references (e.g., variables controlling CAS path and audit signing), health-check steps (e.g., test commands: bernstein run --help and a dry-run like bernstein run --goal "probe" --attach ./screenshot.png --cli claude --dry-run), and a Troubleshooting section covering capability refusals, WorktreeAccessDenied recovery, and audit-chain tamper verification handling. In the same file update: reference the concrete symbols and paths used by the implementation (MultiModalContext, multimodal.attach event, multimodal_attestation.py resolver, register_attachment_parents in lineage_signer.py and the AuditChainStore) and include short recovery commands or checks operators can run to validate CAS entries, signature verification, and cross-worktree access. Ensure examples show both CLI and task YAML usage and mention relevant env vars that affect CAS/audit behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/operations/run.md`:
- Around line 24-30: Update the "Capable adapters" section to show operator
commands for verifying local adapter capabilities instead of only listing static
names: mention using `bernstein adapters list` to quickly enumerate available
adapters and `bernstein adapters check` to perform full conformance inventory
(binary path, version, capabilities, contract validation), and update the
explanatory text so that users are instructed to run those commands before using
`--attach` to confirm which adapters (e.g., claude, gemini) actually support
attachments.
In `@src/bernstein/adapters/claude.py`:
- Around line 71-112: The _inject_multimodal_attachments function currently
computes sha256 by rereading content_path which can diverge from the actual
bytes sent (content_base64); change it so the digest is computed from the exact
bytes used in the payload: if inp.content_base64 is present, decode that base64
to bytes and hash those bytes (falling back to reading Path(content_path) only
when content_base64 is absent), handle base64 decoding errors by setting digest
to "" (or logging) and ensure the mime_type and the produced b64 payload remain
unchanged; update references to inputs, content_base64, content_path, and
mime_type in the function to reflect this sourcing order so the sha256 always
matches the inline attachment bytes.
In `@src/bernstein/adapters/gemini.py`:
- Around line 159-191: _inject_multimodal_attachments is computing sha256 from
the filesystem via content_path rather than from the actual transmitted bytes in
content_base64, which can create a mismatch; change the logic in
_inject_multimodal_attachments to derive the digest from the decoded base64
payload (use base64.b64decode on b64) when content_base64 is present, falling
back to reading content_path only if content_base64 is empty, and ensure
exceptions during decode/read set digest to "" and do not crash so blocks.append
still uses the correct mime and computed sha256.
In `@src/bernstein/core/agents/multimodal_attestation.py`:
- Line 39: The attestation currently re-reads files via inp.content_path when
saving hashes/CAS, which can diverge from the attachment bytes already produced
in build_multimodal_context(paths); change build_multimodal_context to capture
and return the exact bytes used for each attachment (the payloads passed to
multimodal.attach) and use those same in-memory bytes when computing hashes and
storing to CAS and when creating multimodal.attach entries (avoid any subsequent
file re-reads of inp.content_path); update the code paths that reference
inp.content_path (and the multimodal.attach/CAS calls) to accept the bytes
returned from build_multimodal_context so the audit chain records and stores the
identical bytes the model actually received.
- Around line 325-337: The resolver is currently matching events by sha256 only
and then taking the last event across all worktrees; change the lookup to
resolve by the tuple (sha256, worktree_id) instead: when querying audit_chain
(audit_chain.query(event_type=EVENT_MULTIMODAL_ATTACH)) filter events where both
e.details.get("sha256") == sha256 and e.details.get("worktree_id") ==
requesting_worktree_id, raise FileNotFoundError if no such per-worktree matches
exist, then pick the most recent event from that filtered list and proceed with
the existing attached_worktree check and potential WorktreeAccessDenied raise;
update references to matches, attached_worktree, and the error conditions
accordingly.
In `@src/bernstein/core/persistence/lineage_signer.py`:
- Around line 181-193: The build_attachment_parent_uri function currently only
checks sha256 length; update it to also validate that sha256 is a valid
lowercase hex string (i.e., only 0-9 and a-f) before returning
f"{_ATTACHMENT_PARENT_SCHEME}{sha256}" and raise LineageSignerError if
validation fails; locate the function build_attachment_parent_uri and add a
hex-check (e.g., regex or bytes.fromhex validation) against the input and ensure
the error message still reports the bad value/length when raising
LineageSignerError.
In `@src/bernstein/core/planning/plan_loader.py`:
- Around line 254-257: The code currently assumes step.get("attachments") is a
list and does list comprehension over it (attachments: list[str] = [str(a) for a
in (step.get("attachments") or [])]), which will iterate a scalar string
character-by-character; update the logic in plan_loader.py to validate the value
returned by step.get("attachments") before iterating: if it's None treat as
empty list, if it's a list map each element to str, and if it's any other type
raise a clear parsing error (or explicitly reject/non-list with a
TypeError/ValueError). Apply the same validation pattern where attachments is
computed elsewhere (the other occurrence around the function/section at the
later occurrence) so non-list YAML values are rejected at parse time rather than
iterated.
In `@src/bernstein/core/tasks/models.py`:
- Line 512: The list comprehension attachments=[str(a) for a in
raw.get("attachments", [])] will treat a string like "diagram.png" as an
iterable of characters; instead, normalize raw.get("attachments") into a proper
list first (e.g., assign attachments_raw = raw.get("attachments", []) and if
isinstance(attachments_raw, str) then wrap it: attachments_list =
[attachments_raw]; if attachments_raw is None set to []; otherwise ensure it's
an iterable/list) and then use attachments=[str(a) for a in attachments_list];
update the code around the attachments assignment to perform this normalization
so string-valued attachments are handled safely.
---
Outside diff comments:
In `@docs/operations/run.md`:
- Around line 1-101: The doc is missing operational details for the new --attach
workflow; add explicit environment-variable references (e.g., variables
controlling CAS path and audit signing), health-check steps (e.g., test
commands: bernstein run --help and a dry-run like bernstein run --goal "probe"
--attach ./screenshot.png --cli claude --dry-run), and a Troubleshooting section
covering capability refusals, WorktreeAccessDenied recovery, and audit-chain
tamper verification handling. In the same file update: reference the concrete
symbols and paths used by the implementation (MultiModalContext,
multimodal.attach event, multimodal_attestation.py resolver,
register_attachment_parents in lineage_signer.py and the AuditChainStore) and
include short recovery commands or checks operators can run to validate CAS
entries, signature verification, and cross-worktree access. Ensure examples show
both CLI and task YAML usage and mention relevant env vars that affect CAS/audit
behavior.
In `@src/bernstein/adapters/claude.py`:
- Around line 380-381: The code currently prepends base64 attachment blobs into
the prompt and then calls cmd.extend(["-p", prompt]) (in
src/bernstein/adapters/claude.py, around the prompt/attachment handling and the
cmd construction), which can hit OS argv size limits; change the flow so
attachments are not passed inline in the argv: for each attachment detected in
the prompt-building code (the block that prepends base64 around lines 549-570)
write the decoded payload to a temporary file (e.g.,
tempfile.NamedTemporaryFile) and remove the blob from the in-memory prompt, then
update the CLI invocation built with cmd (the place using cmd.extend(["-p",
prompt])) to instead pass a small prompt string and pass attachment file paths
via a separate flag or a stable convention (e.g., "-a", file_path for each temp
file) or feed the prompt via stdin (use '-' and pipe the prompt) so you never
pass large base64 data as a single argv entry.
In `@src/bernstein/adapters/gemini.py`:
- Around line 214-243: The current implementation injects base64 attachment
blobs into prompt via _inject_multimodal_attachments and then passes that huge
string in cmd (binary -p prompt), which risks OS argv size limits; instead,
remove embedding binary data into prompt in the method that builds cmd (stop
calling _inject_multimodal_attachments into prompt) and persist
multimodal_context to a temporary file under the existing workdir (e.g.,
log_path.parent) or stream it via stdin, then modify the command built in
resolve_google_cli_binary usage to pass the attachment file/stream reference
(for example a --attachments-file or use stdin) rather than inline base64;
update any callers that expect injected text and ensure the temp file is
securely created and cleaned up after SpawnResult completes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5d277667-c254-40ce-8673-a14eb068ef6c
📒 Files selected for processing (13)
docs/operations/run.mdsrc/bernstein/adapters/base.pysrc/bernstein/adapters/claude.pysrc/bernstein/adapters/gemini.pysrc/bernstein/cli/main.pysrc/bernstein/cli/run_bootstrap.pysrc/bernstein/core/agents/multimodal_attestation.pysrc/bernstein/core/persistence/lineage_signer.pysrc/bernstein/core/planning/plan_loader.pysrc/bernstein/core/security/audit_chain.pysrc/bernstein/core/tasks/models.pytests/integration/test_run_attach.pytests/unit/test_multimodal_attestation.py
Round-trip and audit fidelity: - multimodal_attestation: hash + store the base64-decoded bytes from content_base64 (the bytes that actually travel to the model API) instead of re-reading content_path. Eliminates the race where the on-disk file changes between encode and attest time and the audit record no longer matches the inlined bytes. - claude / gemini _inject_multimodal_attachments: compute the announced sha256 from the decoded base64 payload, not from a fresh filesystem read. Resolver correctness: - resolve_attachment_for_worker filters chain events by (sha256, worktree_id) before picking, so the same bytes attached in wt-a AND wt-b both resolve in their own worktree. Previously a later attach in wt-b could shadow a valid wt-a resolve. Input validation: - build_attachment_parent_uri rejects non-hex strings (e.g. 'x'*64) with a structured LineageSignerError. - plan_loader rejects scalar `attachments` (e.g. 'attachments: shot.png') with a PlanLoadError instead of iterating the string character by character. - Task.from_dict raises TypeError on scalar attachments payloads via a new _normalize_attachments helper, with a clear hint. Concurrency: - AuditChainStore.log_with_prev_digest holds a threading.Lock around the (read prev_chain_digest -> append) pair so two concurrent attaches always embed distinct predecessors and the on-disk chain stays linear. Docs: - docs/operations/run.md adds explicit verification commands (bernstein adapters list/check, bernstein doctor) and a pointer to the canonical capability function. New tests: - Hash-matches-base64 invariant after on-disk mutation. - Cross-worktree dual-attach: both worktrees resolve independently. - Non-hex / uppercase digest rejection. - Atomic prev_chain_digest under 16-thread concurrent appends. - Task.from_dict scalar attachments rejected. - plan_loader scalar attachments rejected. bot-ack: 3284182740 bot-ack: 3284182744 bot-ack: 3284182752 bot-ack: 3284182756 bot-ack: 3284182761 bot-ack: 3284182781 bot-ack: 3284182784 bot-ack: 3284182792 bot-ack: 3284182800
| A new list with attachment parents appended. | ||
| """ | ||
| seen: set[str] = set(parents) | ||
| out: list[str] = list(parents) |
| (bot-ack: 3284182792 -- CodeRabbit major.) | ||
| """ | ||
| with self._append_lock: | ||
| merged: dict[str, Any] = dict(details) |
Summary
--attach <path>(repeatable) tobernstein run. Capability gate refuses non-multimodal adapters BEFORE spawning.multimodal.attachevent (sha256, mime, install_id_sig, worker_id, turn_seq, prev_chain_digest, worktree_id); bytes stored once in CAS; lineage v1 receipt parents carry the attachment digest.Closes #1797.
Files touched
src/bernstein/core/agents/multimodal_attestation.pysrc/bernstein/core/security/audit_chain.pytests/unit/test_multimodal_attestation.py(23 cases)tests/integration/test_run_attach.py(round-trip + CLI flag)docs/operations/run.mdsrc/bernstein/core/tasks/models.py(Task.attachments field + from_dict)src/bernstein/core/planning/plan_loader.py(YAML attachments key)src/bernstein/cli/run_bootstrap.py(--attach option + capability gate)src/bernstein/adapters/base.py(multimodal_context= keyword on spawn)src/bernstein/adapters/claude.py(base64 wire format)src/bernstein/adapters/gemini.py(base64 wire format)src/bernstein/core/persistence/lineage_signer.py(attachment-as-parent helper)Acceptance criteria
bernstein run "<prompt>" --attach ./shot.pngsucceeds end to end against the Claude and Gemini adapters; the attached image reaches the model API request body as base64 with the correct MIME type.--attachinvocation records an audit-chain entry containing (sha256(image), mime, operator_install_id_sig, worker_id, turn_seq, prev_chain_digest); replay reproduces the exact bytes sent to the model API on the original turn.--attachon an adapter whereis_multimodal_capablereturns False fails with a clear error before any process is launched; the error suggests adapters that support attachments.Test plan
uv run pytest tests/unit/test_multimodal.py tests/unit/test_multimodal_attestation.py tests/integration/test_run_attach.py-- 101 passed.uv run pytest tests/unit/ -k "multimodal or attach"-- 90 passed.uv run pytest tests/unit/ -k "task_model or models or plan_loader or cas_store or lineage_signer"-- 250 passed.uv run pytest tests/unit/ -k "adapter and (claude or gemini)"-- 299 passed, 5 skipped.uv run pytest tests/unit/test_audit_dsse.py tests/unit/test_audit_chain_byteflip_regression.py tests/unit/test_audit_export.py-- 32 passed.uv run ruff check . --fix && uv run ruff format .-- clean.uv run pyright src/bernstein/core/agents/multimodal_attestation.py src/bernstein/core/security/audit_chain.py src/bernstein/core/persistence/lineage_signer.py-- 0 errors.Summary by CodeRabbit
New Features
Documentation
Tests