feat(run): image attachment passthrough with provenance by chernistry · Pull Request #1811 · sipyourdrink-ltd/bernstein

chernistry · 2026-05-21T20:28:42Z

Summary

Adds --attach <path> (repeatable) to bernstein run. Capability gate refuses non-multimodal adapters BEFORE spawning.
Implements the provenance contract from issue #1797 (Image attachment passthrough with provenance): each attach records an HMAC-chained multimodal.attach event (sha256, mime, install_id_sig, worker_id, turn_seq, prev_chain_digest, worktree_id); the bytes are stored once in CAS; lineage v1 receipt parents carry the attachment digest.
Worktree pinning enforced at resolve time.
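
To make the "HMAC-chained" wording concrete, here is a minimal sketch of how such a chain can work. It is not the PR's `AuditChainStore`: the genesis value, the function names, and the canonical-JSON encoding are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical predecessor digest for the first event


def chain_digest(key: bytes, prev_digest: str, details: dict) -> str:
    """HMAC-SHA256 over the previous digest plus canonical JSON of the event."""
    payload = json.dumps(details, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, prev_digest.encode() + payload, hashlib.sha256).hexdigest()


def append_event(chain: list, key: bytes, details: dict) -> dict:
    """Append an event that embeds the digest of its predecessor."""
    prev = chain[-1]["digest"] if chain else GENESIS
    entry = {
        "prev_chain_digest": prev,
        "details": details,
        "digest": chain_digest(key, prev, details),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every digest; any edited event or broken link fails."""
    prev = GENESIS
    for entry in chain:
        expected = chain_digest(key, prev, entry["details"])
        if entry["prev_chain_digest"] != prev:
            return False
        if not hmac.compare_digest(expected, entry["digest"]):
            return False
        prev = entry["digest"]
    return True
```

Because each digest covers the previous one, editing any recorded field invalidates that entry and every entry after it.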

Closes #1797.

Files touched

New: src/bernstein/core/agents/multimodal_attestation.py
New: src/bernstein/core/security/audit_chain.py
New: tests/unit/test_multimodal_attestation.py (23 cases)
New: tests/integration/test_run_attach.py (round-trip + CLI flag)
New: docs/operations/run.md
Modified: src/bernstein/core/tasks/models.py (Task.attachments field + from_dict)
Modified: src/bernstein/core/planning/plan_loader.py (YAML attachments key)
Modified: src/bernstein/cli/run_bootstrap.py (--attach option + capability gate)
Modified: src/bernstein/adapters/base.py (multimodal_context= keyword on spawn)
Modified: src/bernstein/adapters/claude.py (base64 wire format)
Modified: src/bernstein/adapters/gemini.py (base64 wire format)
Modified: src/bernstein/core/persistence/lineage_signer.py (attachment-as-parent helper)

Acceptance criteria

bernstein run "<prompt>" --attach ./shot.png succeeds end to end against the Claude and Gemini adapters; the attached image reaches the model API request body as base64 with the correct MIME type.
Each --attach invocation records an audit-chain entry containing (sha256(image), mime, operator_install_id_sig, worker_id, turn_seq, prev_chain_digest); replay reproduces the exact bytes sent to the model API on the original turn.
The worker's lineage v1 receipt for any artefact produced this turn carries the input image's sha256 in its parents; tamper with the bytes and the chain fails verification.
An image attached to a worker in worktree wt-a is unreachable from wt-b workers in the same session; the chain entry encodes the worktree id and the resolver enforces the boundary.
Task YAML accepts an attachments list field; the orchestrator builds a MultiModalContext at spawn time from the listed paths and passes it to the adapter.
Spawning with --attach on an adapter where is_multimodal_capable returns False fails with a clear error before any process is launched; the error suggests adapters that support attachments.
Tests cover spawn-time wiring (unit), round-trip on a stubbed Claude adapter (integration), audit-chain entry shape, lineage parent inclusion, cross-worktree isolation, and capability-gating refusal.
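
The capability-gating criterion above can be sketched as follows. This is an illustration only: `AdapterInfo`, `AttachmentRefused`, and `refuse_when_incapable_sketch` are hypothetical names, and the real check in `run_bootstrap.py` may be shaped differently.

```python
from dataclasses import dataclass


class AttachmentRefused(Exception):
    """Hypothetical error: the chosen adapter cannot take attachments."""


@dataclass(frozen=True)
class AdapterInfo:
    name: str
    multimodal: bool


def refuse_when_incapable_sketch(adapter, attachments, known):
    """Raise before any process is launched if attachments cannot be honoured."""
    if not attachments or adapter.multimodal:
        return
    capable = sorted(a.name for a in known if a.multimodal)
    raise AttachmentRefused(
        f"adapter '{adapter.name}' does not support --attach; "
        f"try one of: {', '.join(capable)}"
    )
```

Running the gate before spawn means a refused run leaves no worker process, CAS blob, or audit event behind.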

Test plan

uv run pytest tests/unit/test_multimodal.py tests/unit/test_multimodal_attestation.py tests/integration/test_run_attach.py -- 101 passed.
uv run pytest tests/unit/ -k "multimodal or attach" -- 90 passed.
uv run pytest tests/unit/ -k "task_model or models or plan_loader or cas_store or lineage_signer" -- 250 passed.
uv run pytest tests/unit/ -k "adapter and (claude or gemini)" -- 299 passed, 5 skipped.
uv run pytest tests/unit/test_audit_dsse.py tests/unit/test_audit_chain_byteflip_regression.py tests/unit/test_audit_export.py -- 32 passed.
uv run ruff check . --fix && uv run ruff format . -- clean.
uv run pyright src/bernstein/core/agents/multimodal_attestation.py src/bernstein/core/security/audit_chain.py src/bernstein/core/persistence/lineage_signer.py -- 0 errors.

Summary by CodeRabbit

New Features
- Added CLI --attach to pass files to multimodal-capable adapters (Claude, Gemini); attachments are base64-inlined into prompts with SHA‑256 digests, recorded as lineage parents, and rejected early for incapable adapters.
- Worktree-pinned attachment resolution enforcing cross-worktree denial and replay-safe provenance.
Documentation
- Operator guide for the run --attach workflow, attachment wire format, provenance, and adapter compatibility.
Tests
- Comprehensive integration and unit tests covering CLI, adapter injection, provenance, audit-chain, and cross-worktree isolation.

Adds the operator-facing --attach surface and the spawn-time provenance contract: each attached image is anchored to the run via an HMAC-chained audit event, the content-addressed blob store, and the worker's lineage v1 receipt parents.

Changes:
- Add Task.attachments list field and plumb it through Task.from_dict.
- New module src/bernstein/core/agents/multimodal_attestation.py: build_attachment_context() reads paths, stores bytes in CAS, records the audit event, and returns a MultiModalContext.
- New module src/bernstein/core/security/audit_chain.py: AuditChainStore facade over AuditLog plus the additive multimodal.attach event type and the record_multimodal_attach() helper.
- Additive helpers in core/persistence/lineage_signer.py for the attachment-as-parent URI scheme.
- CLI: --attach option on bernstein run, repeatable, validated path, with capability gating before any process is launched.
- Adapters: Claude and Gemini accept multimodal_context= and inline base64-encoded attachments with the documented wire format.
- YAML plan loader honours an attachments: list on each step.
- Worktree pinning enforced at resolve time; cross-worktree attempts raise WorktreeAccessDenied.
- Documentation in docs/operations/run.md.
- Tests:
  - tests/unit/test_multimodal_attestation.py: 23 cases covering Task model field, capability gating, sha256 stability, audit record shape, lineage parents, worktree isolation, replay, tamper detection, chain continuity, and YAML plan loader.
  - tests/integration/test_run_attach.py: end-to-end stub-adapter round-trip plus CLI option validation.

Closes #1797

coderabbitai · 2026-05-21T20:28:49Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f10891fa-ea26-4896-b4a6-10c4273678c3

📥 Commits

Reviewing files that changed from the base of the PR and between 5dd41fc and c05305b.

📒 Files selected for processing (9)

docs/operations/run.md
src/bernstein/adapters/claude.py
src/bernstein/adapters/gemini.py
src/bernstein/core/agents/multimodal_attestation.py
src/bernstein/core/persistence/lineage_signer.py
src/bernstein/core/planning/plan_loader.py
src/bernstein/core/security/audit_chain.py
src/bernstein/core/tasks/models.py
tests/unit/test_multimodal_attestation.py

📝 Walkthrough

This PR implements operator-driven multimodal image attachment passthrough with content-addressed storage, HMAC-chained audit recording, and worktree-pinned access control. Operators invoke bernstein run --attach image.png to pass images to Claude/Gemini adapters; images are base64-encoded into prompts, SHA-256 hashed, persisted to CAS, recorded in the audit chain, and linked as lineage receipt parents. Workers in other worktrees cannot resolve attachments from adjacent worktrees.
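
As a rough illustration of the "persisted to CAS" step in the walkthrough, the sketch below stores bytes under their SHA-256 digest so identical content is written only once. `TinyCAS` and its on-disk layout are assumptions for the example and are not the project's CAS store.

```python
import hashlib
import tempfile
from pathlib import Path


class TinyCAS:
    """Hypothetical content-addressed store: one file per SHA-256 digest."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        blob = self.root / digest
        if not blob.exists():  # identical bytes are stored once
            blob.write_bytes(data)
        return digest

    def get(self, digest: str) -> bytes:
        data = (self.root / digest).read_bytes()
        # Re-hash on read so a tampered blob is detected, not silently served.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed its integrity check")
        return data


with tempfile.TemporaryDirectory() as tmp:
    cas = TinyCAS(Path(tmp) / "cas")
    first = cas.put(b"fake image bytes")
    second = cas.put(b"fake image bytes")
    print(first == second, cas.get(first) == b"fake image bytes")
```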

Changes

Multimodal Attachment Passthrough with Provenance

Layer / File(s)	Summary
Task Model, Audit Infrastructure & Lineage Parents `src/bernstein/core/tasks/models.py`, `src/bernstein/core/security/audit_chain.py`, `src/bernstein/core/persistence/lineage_signer.py`	`Task` gains `attachments: list[str]` field (deserialized from server payload). `AuditChainStore` wraps `AuditLog`, exposing `prev_chain_digest` and `log_with_prev_digest`; new `EVENT_MULTIMODAL_ATTACH` constant, `MultimodalAttachDetails` dataclass, and `record_multimodal_attach` helper. Lineage module adds `build_attachment_parent_uri(sha256)` and `register_attachment_parents(...)` to convert attachment digests into canonical parent URIs.
Adapter Interface & Multimodal Injection `src/bernstein/adapters/base.py`, `src/bernstein/adapters/claude.py`, `src/bernstein/adapters/gemini.py`, `src/bernstein/core/agents/multimodal_attestation.py`	`CLIAdapter.spawn` abstract method gains optional `multimodal_context: Any | None` parameter. Claude and Gemini each add `_inject_multimodal_attachments(prompt, context)` to prepend base64 `<attachment>` blocks with MIME and SHA-256 when inputs exist.
CLI Flag & YAML Integration `src/bernstein/cli/run_bootstrap.py`, `src/bernstein/cli/main.py`, `src/bernstein/core/planning/plan_loader.py`	`--attach` repeatable Click option added to `bernstein run`; main.py wires `attach=()` into run callback. `_run_impl` enforces multimodal capability gating via `refuse_when_incapable` and exports attachment paths to subprocess via `BERNSTEIN_RUN_ATTACHMENTS`. Plan loader parses optional `attachments` list from YAML steps into Task field.
Attachment Context Building & Worker Resolution `src/bernstein/core/agents/multimodal_attestation.py`	`build_attachment_context(attachments, worker_id, turn_seq, worktree_id, cas, audit_chain)` decodes/encodes inputs, computes SHA-256 digests, stores bytes in CAS, records per-attachment `multimodal.attach` events with install signature, worker and worktree identifiers; `resolve_attachment_for_worker(sha256, requesting_worktree_id, cas, audit_chain)` enforces worktree ownership and returns CAS bytes or raises `WorktreeAccessDenied`. Helpers: `encode_one(path)` and `worker_lineage_parents(result)`.
Unit & Integration Test Coverage `tests/integration/test_run_attach.py`, `tests/unit/test_multimodal_attestation.py`	Integration tests cover `--attach` flag repeatability, missing path rejection, base64 encoding in adapter prompts, lineage parent registration, audit-chain tamper detection, capability refusal, cross-worktree isolation, and Gemini injection format. Unit tests validate Task.attachments behavior, gating, SHA-256 determinism, context build lifecycle, worktree pinning, audit events, concurrency, and plan-loader propagation.
Operator Documentation `docs/operations/run.md`	New doc describes `--attach` flag, supported adapters (`claude`, `gemini`), base64 `<attachment>` wire format, provenance flow (CAS + `multimodal.attach` audit events + lineage receipt augmentation), worktree pinning behavior, YAML `attachments:` syntax, and references to implementation modules.
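
The worktree-pinning row above can be pictured with a small resolver sketch. It assumes attach events are available as simple records and that blobs are keyed by digest; `AttachEvent`, `WorktreeAccessDeniedSketch`, and `resolve_for_worker` are illustrative stand-ins, and the choice of which error to raise in each case is an assumption rather than the PR's confirmed behaviour.

```python
from dataclasses import dataclass


class WorktreeAccessDeniedSketch(Exception):
    """Stand-in for the PR's WorktreeAccessDenied."""


@dataclass(frozen=True)
class AttachEvent:
    sha256: str
    worktree_id: str


def resolve_for_worker(sha256, requesting_worktree_id, events, blobs):
    """Return bytes only if this worktree itself recorded an attach of sha256."""
    same_digest = [e for e in events if e.sha256 == sha256]
    if not same_digest:
        raise FileNotFoundError(f"no attach event for {sha256}")
    # Match on (sha256, worktree_id) so an attach in another worktree
    # can neither grant nor shadow access for this one.
    if not any(e.worktree_id == requesting_worktree_id for e in same_digest):
        raise WorktreeAccessDeniedSketch(
            f"{sha256} was not attached in worktree {requesting_worktree_id}"
        )
    return blobs[sha256]
```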

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

sipyourdrink-ltd/bernstein#1741: Both PRs modify src/bernstein/adapters/gemini.py (GeminiAdapter.spawn signature and prompt construction), so integration and merge conflict resolution will be necessary.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 36.99% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and concisely summarizes the main change: adding image attachment passthrough with provenance tracking.
Description check	✅ Passed	The PR description comprehensively covers the What/Why/How for the entire changeset, with an Acceptance Criteria section demonstrating full coverage of requirements from `#1797`.
Linked Issues check	✅ Passed	All requirements from `#1797` are met: operator --attach flag [run_bootstrap.py +46], multimodal context building [multimodal_attestation.py +367], audit-chain anchoring [audit_chain.py +241], worktree pinning [multimodal_attestation.py resolve_attachment_for_worker], task model YAML wiring [tasks/models.py +7, plan_loader.py +4], capability gating [run_bootstrap.py pre-launch check], and comprehensive test coverage [test_run_attach.py +327, test_multimodal_attestation.py +521].
Out of Scope Changes check	✅ Passed	All changes are directly scoped to implementing the `#1797` objectives: CLI attachment support, multimodal context building, audit anchoring, lineage parent registration, capability gating, adapter wiring, tests, and docs. No unrelated changes.


sourcery-ai

Sorry @chernistry, you have reached your weekly rate limit of 2500000 diff characters.

Please try again later or upgrade to continue using Sourcery

github-actions · 2026-05-21T20:28:58Z

Sonar insights (advisory, no merge-block)

Snapshot of bernstein on the configured Sonar instance:

Metric	Value
Coverage	13.5%
Code smells	107
Bugs	11
Vulnerabilities	2
Security hotspots	91

Run bernstein doctor sonar locally for the full surface.

This comment is a soft signal. The Sonar scan runs on push to main; the PR check itself never fails on smells.

github-actions · 2026-05-21T20:29:09Z

Review-bot acknowledgement summary

Must-address findings: 9 (9 acknowledged, 0 open)
Informational findings: 0

All must-address findings are resolved or acknowledged.

github-actions · 2026-05-21T20:29:29Z

bernstein doctor observe for PR #1811 (feat/1797-multimodal-attestation): ok=0, warn=2, fail=0, error=0, skipped=2

sonar -- WARN (project bernstein)

metric	value	delta	threshold	status
coverage_pct	13.5%	new	80.0%	fail
code_smells	107	new	50	warn
bugs	11	new	0	fail
vulnerabilities	2	new	0	warn
security_hotspots	91	new	0	fail

code-scanning -- WARN (38 open alert(s))

metric	value	delta	threshold	status
open_alerts	38	new	0	fail
critical_alerts	0	new	0	ok
high_alerts	2	new	0	warn
medium_alerts	0	new	-	ok
low_alerts	0	new	-	ok

Skipped backends (credentials not configured)

glitchtip: BERNSTEIN_GLITCHTIP_TOKEN not set
dt: DTRACK_URL/TOKEN/PROJECT not set

See docs/observability/unified-doctor.md for backend setup notes.

Auto-applied by contract-drift-autofix.yml on PR #1811. Regenerated via scripts/regen_contract_drift.py. Refs #1273. Source CI run: https://github.com/sipyourdrink-ltd/bernstein/actions/runs/26251274378

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/bernstein/adapters/gemini.py (1)

214-243: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Do not route attachment blobs through argv.

In src/bernstein/adapters/gemini.py Lines 221-243, attachment base64 is injected into prompt and then passed as binary -p <prompt>. That makes multimodal runs vulnerable to OS command-line size limits, so a handful of medium images can fail before the CLI even launches.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/bernstein/adapters/gemini.py` around lines 214 - 243, The current
implementation injects base64 attachment blobs into prompt via
_inject_multimodal_attachments and then passes that huge string in cmd (binary
-p prompt), which risks OS argv size limits; instead, remove embedding binary
data into prompt in the method that builds cmd (stop calling
_inject_multimodal_attachments into prompt) and persist multimodal_context to a
temporary file under the existing workdir (e.g., log_path.parent) or stream it
via stdin, then modify the command built in resolve_google_cli_binary usage to
pass the attachment file/stream reference (for example a --attachments-file or
use stdin) rather than inline base64; update any callers that expect injected
text and ensure the temp file is securely created and cleaned up after
SpawnResult completes.
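
One of the alternatives named in this finding, streaming the prompt over stdin, can be sketched with the standard library alone. The child process here is a Python one-liner standing in for an agent CLI; whether the Gemini CLI actually reads its prompt from stdin is not confirmed by this PR and is an assumption of the example.

```python
import subprocess
import sys


def run_with_stdin_prompt(cmd: list, prompt: str) -> str:
    """Send the prompt on stdin so its size never counts against argv limits."""
    result = subprocess.run(
        cmd, input=prompt, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in child: reports how many characters it received on stdin.
child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]

# Roughly 5 MB of payload, comparable in size to several inlined screenshots.
big_prompt = "A" * 5_000_000
print(run_with_stdin_prompt(child, big_prompt).strip())
```

The argument vector stays a few dozen bytes long regardless of how many attachments are inlined into the prompt.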

src/bernstein/adapters/claude.py (1)

380-381: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Do not send attachment payloads through -p.

In src/bernstein/adapters/claude.py Lines 564-570, base64 attachments are prepended to prompt, and Lines 380-381 still pass the full blob as one argv argument. A few screenshots are enough to hit OS command-line size limits and fail spawn with E2BIG before Claude starts.

Also applies to: 549-570

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/bernstein/adapters/claude.py` around lines 380 - 381, The code currently
prepends base64 attachment blobs into the prompt and then calls
cmd.extend(["-p", prompt]) (in src/bernstein/adapters/claude.py, around the
prompt/attachment handling and the cmd construction), which can hit OS argv size
limits; change the flow so attachments are not passed inline in the argv: for
each attachment detected in the prompt-building code (the block that prepends
base64 around lines 549-570) write the decoded payload to a temporary file
(e.g., tempfile.NamedTemporaryFile) and remove the blob from the in-memory
prompt, then update the CLI invocation built with cmd (the place using
cmd.extend(["-p", prompt])) to instead pass a small prompt string and pass
attachment file paths via a separate flag or a stable convention (e.g., "-a",
file_path for each temp file) or feed the prompt via stdin (use '-' and pipe the
prompt) so you never pass large base64 data as a single argv entry.
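
To get a feel for how quickly inlined attachments approach the limit this finding describes, the sketch below estimates the base64 size of a set of payloads and compares it with the total argument budget the OS reports. The helper names and the headroom value are assumptions; the check is deliberately conservative and only meaningful on POSIX systems.

```python
import os
from typing import Optional


def base64_len(n_bytes: int) -> int:
    """Length of padded base64 output for n_bytes of input (4 chars per 3 bytes)."""
    return 4 * ((n_bytes + 2) // 3)


def arg_max() -> Optional[int]:
    """Total argv+environ budget from sysconf, or None when unavailable."""
    try:
        value = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None
    return value if value > 0 else None


def fits_in_argv(payload_sizes: list, headroom: int = 64 * 1024) -> bool:
    """Estimate whether inlined base64 payloads would fit on a command line."""
    limit = arg_max()
    if limit is None:
        return False  # unknown limit: assume it does not fit
    return sum(base64_len(n) for n in payload_sizes) + headroom < limit


# Three 400 KB screenshots grow to about 1.6 MB once base64-encoded.
print(sum(base64_len(400_000) for _ in range(3)))
```

Even when the total budget is large, many systems also cap the size of a single argument (Linux with 4 KiB pages typically limits one argument to 128 KiB), so a per-argument check would be stricter than this estimate.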

docs/operations/run.md (1)

1-101: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add required operational sections for the new --attach workflow.

Lines 11-92 document behavior but omit explicit environment-variable references, health-check steps, and troubleshooting procedures for operators.

Suggested structure

+## Environment variables
+- `BERNSTEIN_RUN_ATTACHMENTS` (internal propagation for spawned worker context)
+
+## Health checks
+```sh
+bernstein run --help
+bernstein run --goal "probe" --attach ./screenshot.png --cli claude --dry-run
+```
+
+## Troubleshooting
+- Capability refusal when adapter is not multimodal-capable.
+- Cross-worktree resolution denial (`WorktreeAccessDenied`) and recovery steps.
+- Audit-chain tamper verification failure handling steps.

As per coding guidelines: "docs/operations/**/*.md: In deployment and operational documentation, include complete configuration examples, environment variable references, health check procedures, and troubleshooting guides for each deployment scenario".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/operations/run.md` around lines 1 - 101, The doc is missing operational
details for the new --attach workflow; add explicit environment-variable
references (e.g., variables controlling CAS path and audit signing),
health-check steps (e.g., test commands: bernstein run --help and a dry-run like
bernstein run --goal "probe" --attach ./screenshot.png --cli claude --dry-run),
and a Troubleshooting section covering capability refusals, WorktreeAccessDenied
recovery, and audit-chain tamper verification handling. In the same file update:
reference the concrete symbols and paths used by the implementation
(MultiModalContext, multimodal.attach event, multimodal_attestation.py resolver,
register_attachment_parents in lineage_signer.py and the AuditChainStore) and
include short recovery commands or checks operators can run to validate CAS
entries, signature verification, and cross-worktree access. Ensure examples show
both CLI and task YAML usage and mention relevant env vars that affect CAS/audit
behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/operations/run.md`:
- Around line 24-30: Update the "Capable adapters" section to show operator
commands for verifying local adapter capabilities instead of only listing static
names: mention using `bernstein adapters list` to quickly enumerate available
adapters and `bernstein adapters check` to perform full conformance inventory
(binary path, version, capabilities, contract validation), and update the
explanatory text so that users are instructed to run those commands before using
`--attach` to confirm which adapters (e.g., claude, gemini) actually support
attachments.

In `@src/bernstein/adapters/claude.py`:
- Around line 71-112: The _inject_multimodal_attachments function currently
computes sha256 by rereading content_path which can diverge from the actual
bytes sent (content_base64); change it so the digest is computed from the exact
bytes used in the payload: if inp.content_base64 is present, decode that base64
to bytes and hash those bytes (falling back to reading Path(content_path) only
when content_base64 is absent), handle base64 decoding errors by setting digest
to "" (or logging) and ensure the mime_type and the produced b64 payload remain
unchanged; update references to inputs, content_base64, content_path, and
mime_type in the function to reflect this sourcing order so the sha256 always
matches the inline attachment bytes.

In `@src/bernstein/adapters/gemini.py`:
- Around line 159-191: _inject_multimodal_attachments is computing sha256 from
the filesystem via content_path rather than from the actual transmitted bytes in
content_base64, which can create a mismatch; change the logic in
_inject_multimodal_attachments to derive the digest from the decoded base64
payload (use base64.b64decode on b64) when content_base64 is present, falling
back to reading content_path only if content_base64 is empty, and ensure
exceptions during decode/read set digest to "" and do not crash so blocks.append
still uses the correct mime and computed sha256.

In `@src/bernstein/core/agents/multimodal_attestation.py`:
- Line 39: The attestation currently re-reads files via inp.content_path when
saving hashes/CAS, which can diverge from the attachment bytes already produced
in build_multimodal_context(paths); change build_multimodal_context to capture
and return the exact bytes used for each attachment (the payloads passed to
multimodal.attach) and use those same in-memory bytes when computing hashes and
storing to CAS and when creating multimodal.attach entries (avoid any subsequent
file re-reads of inp.content_path); update the code paths that reference
inp.content_path (and the multimodal.attach/CAS calls) to accept the bytes
returned from build_multimodal_context so the audit chain records and stores the
identical bytes the model actually received.
- Around line 325-337: The resolver is currently matching events by sha256 only
and then taking the last event across all worktrees; change the lookup to
resolve by the tuple (sha256, worktree_id) instead: when querying audit_chain
(audit_chain.query(event_type=EVENT_MULTIMODAL_ATTACH)) filter events where both
e.details.get("sha256") == sha256 and e.details.get("worktree_id") ==
requesting_worktree_id, raise FileNotFoundError if no such per-worktree matches
exist, then pick the most recent event from that filtered list and proceed with
the existing attached_worktree check and potential WorktreeAccessDenied raise;
update references to matches, attached_worktree, and the error conditions
accordingly.

In `@src/bernstein/core/persistence/lineage_signer.py`:
- Around line 181-193: The build_attachment_parent_uri function currently only
checks sha256 length; update it to also validate that sha256 is a valid
lowercase hex string (i.e., only 0-9 and a-f) before returning
f"{_ATTACHMENT_PARENT_SCHEME}{sha256}" and raise LineageSignerError if
validation fails; locate the function build_attachment_parent_uri and add a
hex-check (e.g., regex or bytes.fromhex validation) against the input and ensure
the error message still reports the bad value/length when raising
LineageSignerError.

In `@src/bernstein/core/planning/plan_loader.py`:
- Around line 254-257: The code currently assumes step.get("attachments") is a
list and does list comprehension over it (attachments: list[str] = [str(a) for a
in (step.get("attachments") or [])]), which will iterate a scalar string
character-by-character; update the logic in plan_loader.py to validate the value
returned by step.get("attachments") before iterating: if it's None treat as
empty list, if it's a list map each element to str, and if it's any other type
raise a clear parsing error (or explicitly reject/non-list with a
TypeError/ValueError). Apply the same validation pattern where attachments is
computed elsewhere (the other occurrence around the function/section at the
later occurrence) so non-list YAML values are rejected at parse time rather than
iterated.

In `@src/bernstein/core/tasks/models.py`:
- Line 512: The list comprehension attachments=[str(a) for a in
raw.get("attachments", [])] will treat a string like "diagram.png" as an
iterable of characters; instead, normalize raw.get("attachments") into a proper
list first (e.g., assign attachments_raw = raw.get("attachments", []) and if
isinstance(attachments_raw, str) then wrap it: attachments_list =
[attachments_raw]; if attachments_raw is None set to []; otherwise ensure it's
an iterable/list) and then use attachments=[str(a) for a in attachments_list];
update the code around the attachments assignment to perform this normalization
so string-valued attachments are handled safely.
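
The scalar-versus-list pitfall described in the last two findings is easy to reproduce. The sketch below shows the failure mode and one strict way to normalise the value; `normalize_attachments` is a hypothetical helper, and rejecting scalars (rather than wrapping them in a list) is one of the two options the review mentions.

```python
from typing import Any


def normalize_attachments(raw: Any) -> list:
    """Accept None or a list/tuple of paths; reject scalars at parse time."""
    if raw is None:
        return []
    if isinstance(raw, (str, bytes)):
        raise TypeError(
            "attachments must be a list of paths, got a scalar; "
            "write it as a YAML list, e.g. 'attachments: [shot.png]'"
        )
    if not isinstance(raw, (list, tuple)):
        raise TypeError(f"attachments must be a list, got {type(raw).__name__}")
    return [str(item) for item in raw]


# The bug being guarded against: iterating a string yields its characters.
print([str(a) for a in "a.png"])
print(normalize_attachments(["a.png", "b.png"]))
```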

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5d277667-c254-40ce-8673-a14eb068ef6c

📥 Commits

Reviewing files that changed from the base of the PR and between 6160ea6 and 5dd41fc.

📒 Files selected for processing (13)

docs/operations/run.md
src/bernstein/adapters/base.py
src/bernstein/adapters/claude.py
src/bernstein/adapters/gemini.py
src/bernstein/cli/main.py
src/bernstein/cli/run_bootstrap.py
src/bernstein/core/agents/multimodal_attestation.py
src/bernstein/core/persistence/lineage_signer.py
src/bernstein/core/planning/plan_loader.py
src/bernstein/core/security/audit_chain.py
src/bernstein/core/tasks/models.py
tests/integration/test_run_attach.py
tests/unit/test_multimodal_attestation.py

Round-trip and audit fidelity:
- multimodal_attestation: hash + store the base64-decoded bytes from content_base64 (the bytes that actually travel to the model API) instead of re-reading content_path. Eliminates the race where the on-disk file changes between encode and attest time and the audit record no longer matches the inlined bytes.
- claude / gemini _inject_multimodal_attachments: compute the announced sha256 from the decoded base64 payload, not from a fresh filesystem read.

Resolver correctness:
- resolve_attachment_for_worker filters chain events by (sha256, worktree_id) before picking, so the same bytes attached in wt-a AND wt-b both resolve in their own worktree. Previously a later attach in wt-b could shadow a valid wt-a resolve.

Input validation:
- build_attachment_parent_uri rejects non-hex strings (e.g. 'x'*64) with a structured LineageSignerError.
- plan_loader rejects scalar `attachments` (e.g. 'attachments: shot.png') with a PlanLoadError instead of iterating the string character by character.
- Task.from_dict raises TypeError on scalar attachments payloads via a new _normalize_attachments helper, with a clear hint.

Concurrency:
- AuditChainStore.log_with_prev_digest holds a threading.Lock around the (read prev_chain_digest -> append) pair so two concurrent attaches always embed distinct predecessors and the on-disk chain stays linear.

Docs:
- docs/operations/run.md adds explicit verification commands (bernstein adapters list/check, bernstein doctor) and a pointer to the canonical capability function.

New tests:
- Hash-matches-base64 invariant after on-disk mutation.
- Cross-worktree dual-attach: both worktrees resolve independently.
- Non-hex / uppercase digest rejection.
- Atomic prev_chain_digest under 16-thread concurrent appends.
- Task.from_dict scalar attachments rejected.
- plan_loader scalar attachments rejected.
bot-ack: 3284182740 bot-ack: 3284182744 bot-ack: 3284182752 bot-ack: 3284182756 bot-ack: 3284182761 bot-ack: 3284182781 bot-ack: 3284182784 bot-ack: 3284182792 bot-ack: 3284182800
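
The round-trip fix in this commit rests on hashing the same bytes that are sent, not the file on disk. A minimal sketch of that idea follows; `digest_of_payload` is a hypothetical name, and returning an empty string on a malformed payload follows the reviewer's suggestion, not confirmed final behaviour.

```python
import base64
import binascii
import hashlib


def digest_of_payload(content_base64: str) -> str:
    """SHA-256 of the decoded base64 payload, i.e. of the bytes actually sent."""
    try:
        raw = base64.b64decode(content_base64, validate=True)
    except (binascii.Error, ValueError):
        return ""  # malformed payload: no digest instead of a crash
    return hashlib.sha256(raw).hexdigest()


original = b"bytes read at encode time"
payload = base64.b64encode(original).decode("ascii")

# Even if the file on disk changes after encoding, the digest still
# describes the inlined payload, because the file is never re-read.
print(digest_of_payload(payload) == hashlib.sha256(original).hexdigest())
```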

+        A new list with attachment parents appended.
+    """
+    seen: set[str] = set(parents)
+    out: list[str] = list(parents)
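
The fragment above comes from the helper that appends attachment parents to a lineage receipt. The sketch below shows the same seen/out pattern together with the lowercase-hex validation added in this PR. The URI scheme string and function names are assumptions, and `ValueError` stands in for the project's `LineageSignerError`.

```python
import re

ATTACHMENT_SCHEME = "attachment:sha256:"  # hypothetical scheme prefix
_HEX64 = re.compile(r"[0-9a-f]{64}")


def attachment_parent_uri(sha256: str) -> str:
    """Build a parent URI, accepting only 64 lowercase hex characters."""
    if not _HEX64.fullmatch(sha256):
        raise ValueError(
            f"not a lowercase hex sha256 (length {len(sha256)}): {sha256!r}"
        )
    return ATTACHMENT_SCHEME + sha256


def with_attachment_parents(parents: list, digests: list) -> list:
    """Return a new list with attachment parents appended, without duplicates."""
    seen = set(parents)
    out = list(parents)
    for digest in digests:
        uri = attachment_parent_uri(digest)
        if uri not in seen:
            seen.add(uri)
            out.append(uri)
    return out
```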


+        (bot-ack: 3284182792 -- CodeRabbit major.)
+        """
+        with self._append_lock:
+            merged: dict[str, Any] = dict(details)
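
The lock in the fragment above protects the read-predecessor-then-append pair. A self-contained sketch of why that matters is shown here; `LinearChain` is a toy in-memory stand-in for the audit log and uses plain SHA-256 chaining where the project uses an HMAC chain.

```python
import hashlib
import threading

GENESIS = "0" * 64  # hypothetical predecessor for the first entry


class LinearChain:
    """Toy append-only chain whose entries embed their predecessor's digest."""

    def __init__(self) -> None:
        self._append_lock = threading.Lock()
        self._entries = []  # list of (prev_digest, digest) tuples

    def append(self, payload: bytes) -> str:
        # Reading the tail and appending must be one atomic step; otherwise
        # two threads can read the same tail and fork the chain.
        with self._append_lock:
            prev = self._entries[-1][1] if self._entries else GENESIS
            digest = hashlib.sha256(prev.encode() + payload).hexdigest()
            self._entries.append((prev, digest))
            return digest

    def entries(self) -> list:
        with self._append_lock:
            return list(self._entries)
```

With the lock held across both steps, every entry's predecessor is the entry immediately before it, which is the linearity property the 16-thread test in this PR checks.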


chernistry enabled auto-merge (squash) May 21, 2026 20:28

sourcery-ai Bot reviewed May 21, 2026

github-actions Bot added core cli docs tests adapters size/xl labels May 21, 2026

chore(ci): regenerate contract drift allow-lists

5dd41fc


coderabbitai Bot reviewed May 21, 2026

chernistry merged commit 139ffa3 into main May 21, 2026
61 of 64 checks passed

chernistry deleted the feat/1797-multimodal-attestation branch May 21, 2026 20:47

github-advanced-security AI found potential problems May 21, 2026

coderabbitai Bot mentioned this pull request May 22, 2026

Fix adapter multimodal spawn contract #1928

Merged
