
feat(run): image attachment passthrough with provenance #1811

Merged
chernistry merged 3 commits into main from feat/1797-multimodal-attestation
May 21, 2026

Conversation

@chernistry

@chernistry chernistry commented May 21, 2026

Collaborator

Summary

  • Adds --attach <path> (repeatable) to bernstein run. Capability gate refuses non-multimodal adapters BEFORE spawning.
  • Implements the provenance contract from #1797 (Image attachment passthrough with provenance): each --attach records an HMAC-chained multimodal.attach event (sha256, mime, install_id_sig, worker_id, turn_seq, prev_chain_digest, worktree_id); the bytes are stored once in CAS; lineage v1 receipt parents carry the attachment digest.
  • Worktree pinning enforced at resolve time.
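
As a rough illustration of what "HMAC-chained" means for the multimodal.attach event, here is a minimal sketch. It is not the code in audit_chain.py: the function names, the JSON canonicalisation, the all-zero genesis digest, and the flat dict layout are assumptions made for the example, and the install_id_sig field is left out for brevity.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed predecessor digest for the first entry


def chain_digest(key: bytes, body: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the event body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_attach_event(chain: list, key: bytes, *, image: bytes, mime: str,
                        worker_id: str, turn_seq: int, worktree_id: str) -> dict:
    """Append one attach event that embeds the digest of its predecessor."""
    prev = chain[-1]["digest"] if chain else GENESIS
    body = {
        "event_type": "multimodal.attach",
        "sha256": hashlib.sha256(image).hexdigest(),
        "mime": mime,
        "worker_id": worker_id,
        "turn_seq": turn_seq,
        "worktree_id": worktree_id,
        "prev_chain_digest": prev,
    }
    entry = {**body, "digest": chain_digest(key, body)}
    chain.append(entry)
    return entry


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every digest and check each link to its predecessor."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "digest"}
        if body["prev_chain_digest"] != prev:
            return False
        if not hmac.compare_digest(entry["digest"], chain_digest(key, body)):
            return False
        prev = entry["digest"]
    return True
```

Because each entry's digest covers prev_chain_digest, editing any recorded field (for example the image sha256) invalidates that entry's digest on verification.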

Closes #1797.

Files touched

  • New: src/bernstein/core/agents/multimodal_attestation.py
  • New: src/bernstein/core/security/audit_chain.py
  • New: tests/unit/test_multimodal_attestation.py (23 cases)
  • New: tests/integration/test_run_attach.py (round-trip + CLI flag)
  • New: docs/operations/run.md
  • Modified: src/bernstein/core/tasks/models.py (Task.attachments field + from_dict)
  • Modified: src/bernstein/core/planning/plan_loader.py (YAML attachments key)
  • Modified: src/bernstein/cli/run_bootstrap.py (--attach option + capability gate)
  • Modified: src/bernstein/adapters/base.py (multimodal_context= keyword on spawn)
  • Modified: src/bernstein/adapters/claude.py (base64 wire format)
  • Modified: src/bernstein/adapters/gemini.py (base64 wire format)
  • Modified: src/bernstein/core/persistence/lineage_signer.py (attachment-as-parent helper)

Acceptance criteria

  • bernstein run "<prompt>" --attach ./shot.png succeeds end to end against the Claude and Gemini adapters; the attached image reaches the model API request body as base64 with the correct MIME type.
  • Each --attach invocation records an audit-chain entry containing (sha256(image), mime, operator_install_id_sig, worker_id, turn_seq, prev_chain_digest); replay reproduces the exact bytes sent to the model API on the original turn.
  • The worker's lineage v1 receipt for any artefact produced this turn carries the input image's sha256 in its parents; tamper with the bytes and the chain fails verification.
  • An image attached to a worker in worktree wt-a is unreachable from wt-b workers in the same session; the chain entry encodes the worktree id and the resolver enforces the boundary.
  • Task YAML accepts an attachments list field; the orchestrator builds a MultiModalContext at spawn time from the listed paths and passes it to the adapter.
  • Spawning with --attach on an adapter where is_multimodal_capable returns False fails with a clear error before any process is launched; the error suggests adapters that support attachments.
  • Tests cover spawn-time wiring (unit), round-trip on a stubbed Claude adapter (integration), audit-chain entry shape, lineage parent inclusion, cross-worktree isolation, and capability-gating refusal.
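
The capability-gating criterion can be pictured with a small sketch. The PR names refuse_when_incapable and is_multimodal_capable, but their real signatures are not shown here, so the AdapterInfo record, the exception type, and the "textonly" adapter below are hypothetical stand-ins.

```python
from dataclasses import dataclass


class AttachmentsNotSupported(Exception):
    """Raised before any process is launched (hypothetical error type)."""


@dataclass(frozen=True)
class AdapterInfo:
    name: str
    multimodal: bool  # stand-in for an is_multimodal_capable() result


def refuse_when_incapable(adapter: AdapterInfo, attachments: list,
                          known: list) -> None:
    """Fail fast when attachments are requested on a non-multimodal adapter."""
    if not attachments or adapter.multimodal:
        return
    capable = sorted(a.name for a in known if a.multimodal)
    hint = ", ".join(capable) or "none available"
    raise AttachmentsNotSupported(
        f"adapter {adapter.name!r} cannot accept --attach; "
        f"adapters that support attachments: {hint}"
    )
```

Running the check before building the spawn command means a refusal leaves no child process, CAS blob, or audit entry behind.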

Test plan

  • uv run pytest tests/unit/test_multimodal.py tests/unit/test_multimodal_attestation.py tests/integration/test_run_attach.py -- 101 passed.
  • uv run pytest tests/unit/ -k "multimodal or attach" -- 90 passed.
  • uv run pytest tests/unit/ -k "task_model or models or plan_loader or cas_store or lineage_signer" -- 250 passed.
  • uv run pytest tests/unit/ -k "adapter and (claude or gemini)" -- 299 passed, 5 skipped.
  • uv run pytest tests/unit/test_audit_dsse.py tests/unit/test_audit_chain_byteflip_regression.py tests/unit/test_audit_export.py -- 32 passed.
  • uv run ruff check . --fix && uv run ruff format . -- clean.
  • uv run pyright src/bernstein/core/agents/multimodal_attestation.py src/bernstein/core/security/audit_chain.py src/bernstein/core/persistence/lineage_signer.py -- 0 errors.

Summary by CodeRabbit

  • New Features

    • Added CLI --attach to pass files to multimodal-capable adapters (Claude, Gemini); attachments are base64-inlined into prompts with SHA‑256 digests, recorded as lineage parents, and rejected early for incapable adapters.
    • Worktree-pinned attachment resolution enforcing cross-worktree denial and replay-safe provenance.
  • Documentation

    • Operator guide for the run --attach workflow, attachment wire format, provenance, and adapter compatibility.
  • Tests

    • Comprehensive integration and unit tests covering CLI, adapter injection, provenance, audit-chain, and cross-worktree isolation.

Review Change Stack

Adds the operator-facing --attach surface and the spawn-time
provenance contract: each attached image is anchored to the run via
an HMAC-chained audit event, the content-addressed blob store, and
the worker's lineage v1 receipt parents.

Changes:
- Add Task.attachments list field and plumb it through Task.from_dict.
- New module src/bernstein/core/agents/multimodal_attestation.py:
  build_attachment_context() reads paths, stores bytes in CAS,
  records the audit event, and returns a MultiModalContext.
- New module src/bernstein/core/security/audit_chain.py: AuditChainStore
  facade over AuditLog plus the additive multimodal.attach event type
  and the record_multimodal_attach() helper.
- Additive helpers in core/persistence/lineage_signer.py for the
  attachment-as-parent URI scheme.
- CLI: --attach option on bernstein run, repeatable, validated path,
  with capability gating before any process is launched.
- Adapters: Claude and Gemini accept multimodal_context= and inline
  base64-encoded attachments with the documented wire format.
- YAML plan loader honours an attachments: list on each step.
- Worktree pinning enforced at resolve time; cross-worktree attempts
  raise WorktreeAccessDenied.
- Documentation in docs/operations/run.md.
- Tests:
  - tests/unit/test_multimodal_attestation.py: 23 cases covering
    Task model field, capability gating, sha256 stability, audit
    record shape, lineage parents, worktree isolation, replay,
    tamper detection, chain continuity, and YAML plan loader.
  - tests/integration/test_run_attach.py: end-to-end stub-adapter
    round-trip plus CLI option validation.

Closes #1797
@coderabbitai

coderabbitai Bot commented May 21, 2026


Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f10891fa-ea26-4896-b4a6-10c4273678c3

📥 Commits

Reviewing files that changed from the base of the PR and between 5dd41fc and c05305b.

📒 Files selected for processing (9)
  • docs/operations/run.md
  • src/bernstein/adapters/claude.py
  • src/bernstein/adapters/gemini.py
  • src/bernstein/core/agents/multimodal_attestation.py
  • src/bernstein/core/persistence/lineage_signer.py
  • src/bernstein/core/planning/plan_loader.py
  • src/bernstein/core/security/audit_chain.py
  • src/bernstein/core/tasks/models.py
  • tests/unit/test_multimodal_attestation.py

📝 Walkthrough

Walkthrough

This PR implements operator-driven multimodal image attachment passthrough with content-addressed storage, HMAC-chained audit recording, and worktree-pinned access control. Operators invoke bernstein run --attach image.png to pass images to Claude/Gemini adapters; images are base64-encoded into prompts, SHA-256 hashed, persisted to CAS, recorded in the audit chain, and linked as lineage receipt parents. Workers in other worktrees cannot resolve attachments from adjacent worktrees.
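
For readers unfamiliar with content-addressed storage, the "persisted to CAS" step can be sketched as below. This is an in-memory toy, not the project's CAS store; the class and method names are assumptions.

```python
import hashlib


class InMemoryCAS:
    """Toy content-addressed store: the key is the SHA-256 of the bytes."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical bytes map to the same key, so they are stored once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Re-hash on read so silent corruption is detected.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed integrity check")
        return data

    def __len__(self) -> int:
        return len(self._blobs)
```

Attaching the same image twice yields the same digest and a single stored blob, which is what lets the audit event and the lineage parent both refer to the attachment by its SHA-256 alone.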

Changes

Multimodal Attachment Passthrough with Provenance

| Layer / File(s) | Summary |
| --- | --- |
| Task Model, Audit Infrastructure & Lineage Parents (src/bernstein/core/tasks/models.py, src/bernstein/core/security/audit_chain.py, src/bernstein/core/persistence/lineage_signer.py) | Task gains attachments: list[str] field (deserialized from server payload). AuditChainStore wraps AuditLog, exposing prev_chain_digest and log_with_prev_digest; new EVENT_MULTIMODAL_ATTACH constant, MultimodalAttachDetails dataclass, and record_multimodal_attach helper. Lineage module adds build_attachment_parent_uri(sha256) and register_attachment_parents(...) to convert attachment digests into canonical parent URIs. |
| Adapter Interface & Multimodal Injection (src/bernstein/adapters/base.py, src/bernstein/adapters/claude.py, src/bernstein/adapters/gemini.py, src/bernstein/core/agents/multimodal_attestation.py) | CLIAdapter.spawn abstract method gains optional multimodal_context: Any \| None parameter. Claude and Gemini each add _inject_multimodal_attachments(prompt, context) to prepend base64 <attachment> blocks with MIME and SHA-256 when inputs exist. |
| CLI Flag & YAML Integration (src/bernstein/cli/run_bootstrap.py, src/bernstein/cli/main.py, src/bernstein/core/planning/plan_loader.py) | --attach repeatable Click option added to bernstein run; main.py wires attach=() into run callback. _run_impl enforces multimodal capability gating via refuse_when_incapable and exports attachment paths to subprocess via BERNSTEIN_RUN_ATTACHMENTS. Plan loader parses optional attachments list from YAML steps into Task field. |
| Attachment Context Building & Worker Resolution (src/bernstein/core/agents/multimodal_attestation.py) | build_attachment_context(attachments, worker_id, turn_seq, worktree_id, cas, audit_chain) decodes/encodes inputs, computes SHA-256 digests, stores bytes in CAS, records per-attachment multimodal.attach events with install signature, worker and worktree identifiers; resolve_attachment_for_worker(sha256, requesting_worktree_id, cas, audit_chain) enforces worktree ownership and returns CAS bytes or raises WorktreeAccessDenied. Helpers: encode_one(path) and worker_lineage_parents(result). |
| Unit & Integration Test Coverage (tests/integration/test_run_attach.py, tests/unit/test_multimodal_attestation.py) | Integration tests cover --attach flag repeatability, missing path rejection, base64 encoding in adapter prompts, lineage parent registration, audit-chain tamper detection, capability refusal, cross-worktree isolation, and Gemini injection format. Unit tests validate Task.attachments behavior, gating, SHA-256 determinism, context build lifecycle, worktree pinning, audit events, concurrency, and plan-loader propagation. |
| Operator Documentation (docs/operations/run.md) | New doc describes --attach flag, supported adapters (claude, gemini), base64 <attachment> wire format, provenance flow (CAS + multimodal.attach audit events + lineage receipt augmentation), worktree pinning behavior, YAML attachments: syntax, and references to implementation modules. |

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

  • sipyourdrink-ltd/bernstein#1741: Both PRs modify src/bernstein/adapters/gemini.py (GeminiAdapter.spawn signature and prompt construction), so integration and merge conflict resolution will be necessary.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.99% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding image attachment passthrough with provenance tracking. |
| Description check | ✅ Passed | The PR description comprehensively covers the What/Why/How for the entire changeset, with an Acceptance Criteria section demonstrating full coverage of requirements from #1797. |
| Linked Issues check | ✅ Passed | All requirements from #1797 are met: operator --attach flag [run_bootstrap.py +46], multimodal context building [multimodal_attestation.py +367], audit-chain anchoring [audit_chain.py +241], worktree pinning [multimodal_attestation.py resolve_attachment_for_worker], task model YAML wiring [tasks/models.py +7, plan_loader.py +4], capability gating [run_bootstrap.py pre-launch check], and comprehensive test coverage [test_run_attach.py +327, test_multimodal_attestation.py +521]. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing the #1797 objectives: CLI attachment support, multimodal context building, audit anchoring, lineage parent registration, capability gating, adapter wiring, tests, and docs. No unrelated changes. |


@chernistry chernistry enabled auto-merge (squash) May 21, 2026 20:28

@sourcery-ai sourcery-ai Bot left a comment


Sorry @chernistry, you have reached your weekly rate limit of 2500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@github-actions

github-actions Bot commented May 21, 2026

Contributor

Sonar insights (advisory, no merge-block)

Snapshot of bernstein on the configured Sonar instance:

| Metric | Value |
| --- | --- |
| Coverage | 13.5 |
| Code smells | 107 |
| Bugs | 11 |
| Vulnerabilities | 2 |
| Security hotspots | 91 |

Run bernstein doctor sonar locally for the full surface.

This comment is a soft signal. The Sonar scan runs on push to main; the PR check itself never fails on smells.

@github-actions

github-actions Bot commented May 21, 2026

Contributor

Review-bot acknowledgement summary

  • Must-address findings: 9 (9 acknowledged, 0 open)
  • Informational findings: 0

All must-address findings are resolved or acknowledged.

@github-actions

github-actions Bot commented May 21, 2026

Contributor

bernstein doctor observe for PR #1811 (feat/1797-multimodal-attestation): ok=0, warn=2, fail=0, error=0, skipped=2

sonar -- WARN (project bernstein)

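| metric | value | delta | threshold | status |
| --- | --- | --- | --- | --- |
| coverage_pct | 13.5% | new | 80.0% | fail |
| code_smells | 107 | new | 50 | warn |
| bugs | 11 | new | 0 | fail |
| vulnerabilities | 2 | new | 0 | warn |
| security_hotspots | 91 | new | 0 | fail |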

code-scanning -- WARN (38 open alert(s))

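| metric | value | delta | threshold | status |
| --- | --- | --- | --- | --- |
| open_alerts | 38 | new | 0 | fail |
| critical_alerts | 0 | new | 0 | ok |
| high_alerts | 2 | new | 0 | warn |
| medium_alerts | 0 | new | - | ok |
| low_alerts | 0 | new | - | ok |

Skipped backends (credentials not configured)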
  • glitchtip: BERNSTEIN_GLITCHTIP_TOKEN not set
  • dt: DTRACK_URL/TOKEN/PROJECT not set

See docs/observability/unified-doctor.md for backend setup notes.

Auto-applied by contract-drift-autofix.yml on PR #1811.
Regenerated via scripts/regen_contract_drift.py. Refs #1273.

Source CI run: https://github.com/sipyourdrink-ltd/bernstein/actions/runs/26251274378

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/bernstein/adapters/gemini.py (1)

214-243: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Do not route attachment blobs through argv.

In src/bernstein/adapters/gemini.py Lines 221-243, attachment base64 is injected into prompt and then passed as binary -p <prompt>. That makes multimodal runs vulnerable to OS command-line size limits, so a handful of medium images can fail before the CLI even launches.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/bernstein/adapters/gemini.py` around lines 214 - 243, The current
implementation injects base64 attachment blobs into prompt via
_inject_multimodal_attachments and then passes that huge string in cmd (binary
-p prompt), which risks OS argv size limits; instead, remove embedding binary
data into prompt in the method that builds cmd (stop calling
_inject_multimodal_attachments into prompt) and persist multimodal_context to a
temporary file under the existing workdir (e.g., log_path.parent) or stream it
via stdin, then modify the command built in resolve_google_cli_binary usage to
pass the attachment file/stream reference (for example a --attachments-file or
use stdin) rather than inline base64; update any callers that expect injected
text and ensure the temp file is securely created and cleaned up after
SpawnResult completes.
src/bernstein/adapters/claude.py (1)

380-381: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Do not send attachment payloads through -p.

In src/bernstein/adapters/claude.py Lines 564-570, base64 attachments are prepended to prompt, and Lines 380-381 still pass the full blob as one argv argument. A few screenshots are enough to hit OS command-line size limits and fail spawn with E2BIG before Claude starts.

Also applies to: 549-570

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/bernstein/adapters/claude.py` around lines 380 - 381, The code currently
prepends base64 attachment blobs into the prompt and then calls
cmd.extend(["-p", prompt]) (in src/bernstein/adapters/claude.py, around the
prompt/attachment handling and the cmd construction), which can hit OS argv size
limits; change the flow so attachments are not passed inline in the argv: for
each attachment detected in the prompt-building code (the block that prepends
base64 around lines 549-570) write the decoded payload to a temporary file
(e.g., tempfile.NamedTemporaryFile) and remove the blob from the in-memory
prompt, then update the CLI invocation built with cmd (the place using
cmd.extend(["-p", prompt])) to instead pass a small prompt string and pass
attachment file paths via a separate flag or a stable convention (e.g., "-a",
file_path for each temp file) or feed the prompt via stdin (use '-' and pipe the
prompt) so you never pass large base64 data as a single argv entry.
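
One way to picture the stdin alternative mentioned above is the sketch below. It deliberately does not invoke the Claude or Gemini CLI, since their stdin conventions are not confirmed here; a tiny Python child process stands in for the adapter binary, and the function name is hypothetical.

```python
import subprocess
import sys


def run_with_stdin_prompt(prompt: str) -> str:
    """Send a large prompt over stdin instead of as an argv entry.

    The child is a stand-in that reports how many characters it received;
    a real adapter would substitute its own CLI command here.
    """
    child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
    result = subprocess.run(
        child,
        input=prompt,          # written to the child's stdin via a pipe
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

A pipe has no fixed total-size ceiling comparable to the argv limit, so a multi-megabyte base64 payload sent this way does not trigger E2BIG at spawn time.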
docs/operations/run.md (1)

1-101: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add required operational sections for the new --attach workflow.

Lines 11-92 document behavior but omit explicit environment-variable references, health-check steps, and troubleshooting procedures for operators.

Suggested structure
+## Environment variables
+- `BERNSTEIN_RUN_ATTACHMENTS` (internal propagation for spawned worker context)
+
+## Health checks
+```sh
+bernstein run --help
+bernstein run --goal "probe" --attach ./screenshot.png --cli claude --dry-run
+```
+
+## Troubleshooting
+- Capability refusal when adapter is not multimodal-capable.
+- Cross-worktree resolution denial (`WorktreeAccessDenied`) and recovery steps.
+- Audit-chain tamper verification failure handling steps.

As per coding guidelines: "docs/operations/**/*.md: In deployment and operational documentation, include complete configuration examples, environment variable references, health check procedures, and troubleshooting guides for each deployment scenario".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/operations/run.md` around lines 1 - 101, The doc is missing operational
details for the new --attach workflow; add explicit environment-variable
references (e.g., variables controlling CAS path and audit signing),
health-check steps (e.g., test commands: bernstein run --help and a dry-run like
bernstein run --goal "probe" --attach ./screenshot.png --cli claude --dry-run),
and a Troubleshooting section covering capability refusals, WorktreeAccessDenied
recovery, and audit-chain tamper verification handling. In the same file update:
reference the concrete symbols and paths used by the implementation
(MultiModalContext, multimodal.attach event, multimodal_attestation.py resolver,
register_attachment_parents in lineage_signer.py and the AuditChainStore) and
include short recovery commands or checks operators can run to validate CAS
entries, signature verification, and cross-worktree access. Ensure examples show
both CLI and task YAML usage and mention relevant env vars that affect CAS/audit
behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/operations/run.md`:
- Around line 24-30: Update the "Capable adapters" section to show operator
commands for verifying local adapter capabilities instead of only listing static
names: mention using `bernstein adapters list` to quickly enumerate available
adapters and `bernstein adapters check` to perform full conformance inventory
(binary path, version, capabilities, contract validation), and update the
explanatory text so that users are instructed to run those commands before using
`--attach` to confirm which adapters (e.g., claude, gemini) actually support
attachments.

In `@src/bernstein/adapters/claude.py`:
- Around line 71-112: The _inject_multimodal_attachments function currently
computes sha256 by rereading content_path which can diverge from the actual
bytes sent (content_base64); change it so the digest is computed from the exact
bytes used in the payload: if inp.content_base64 is present, decode that base64
to bytes and hash those bytes (falling back to reading Path(content_path) only
when content_base64 is absent), handle base64 decoding errors by setting digest
to "" (or logging) and ensure the mime_type and the produced b64 payload remain
unchanged; update references to inputs, content_base64, content_path, and
mime_type in the function to reflect this sourcing order so the sha256 always
matches the inline attachment bytes.

In `@src/bernstein/adapters/gemini.py`:
- Around line 159-191: _inject_multimodal_attachments is computing sha256 from
the filesystem via content_path rather than from the actual transmitted bytes in
content_base64, which can create a mismatch; change the logic in
_inject_multimodal_attachments to derive the digest from the decoded base64
payload (use base64.b64decode on b64) when content_base64 is present, falling
back to reading content_path only if content_base64 is empty, and ensure
exceptions during decode/read set digest to "" and do not crash so blocks.append
still uses the correct mime and computed sha256.

In `@src/bernstein/core/agents/multimodal_attestation.py`:
- Line 39: The attestation currently re-reads files via inp.content_path when
saving hashes/CAS, which can diverge from the attachment bytes already produced
in build_multimodal_context(paths); change build_multimodal_context to capture
and return the exact bytes used for each attachment (the payloads passed to
multimodal.attach) and use those same in-memory bytes when computing hashes and
storing to CAS and when creating multimodal.attach entries (avoid any subsequent
file re-reads of inp.content_path); update the code paths that reference
inp.content_path (and the multimodal.attach/CAS calls) to accept the bytes
returned from build_multimodal_context so the audit chain records and stores the
identical bytes the model actually received.
- Around line 325-337: The resolver is currently matching events by sha256 only
and then taking the last event across all worktrees; change the lookup to
resolve by the tuple (sha256, worktree_id) instead: when querying audit_chain
(audit_chain.query(event_type=EVENT_MULTIMODAL_ATTACH)) filter events where both
e.details.get("sha256") == sha256 and e.details.get("worktree_id") ==
requesting_worktree_id, raise FileNotFoundError if no such per-worktree matches
exist, then pick the most recent event from that filtered list and proceed with
the existing attached_worktree check and potential WorktreeAccessDenied raise;
update references to matches, attached_worktree, and the error conditions
accordingly.

In `@src/bernstein/core/persistence/lineage_signer.py`:
- Around line 181-193: The build_attachment_parent_uri function currently only
checks sha256 length; update it to also validate that sha256 is a valid
lowercase hex string (i.e., only 0-9 and a-f) before returning
f"{_ATTACHMENT_PARENT_SCHEME}{sha256}" and raise LineageSignerError if
validation fails; locate the function build_attachment_parent_uri and add a
hex-check (e.g., regex or bytes.fromhex validation) against the input and ensure
the error message still reports the bad value/length when raising
LineageSignerError.

In `@src/bernstein/core/planning/plan_loader.py`:
- Around line 254-257: The code currently assumes step.get("attachments") is a
list and does list comprehension over it (attachments: list[str] = [str(a) for a
in (step.get("attachments") or [])]), which will iterate a scalar string
character-by-character; update the logic in plan_loader.py to validate the value
returned by step.get("attachments") before iterating: if it's None treat as
empty list, if it's a list map each element to str, and if it's any other type
raise a clear parsing error (or explicitly reject/non-list with a
TypeError/ValueError). Apply the same validation pattern where attachments is
computed elsewhere (the other occurrence around the function/section at the
later occurrence) so non-list YAML values are rejected at parse time rather than
iterated.

In `@src/bernstein/core/tasks/models.py`:
- Line 512: The list comprehension attachments=[str(a) for a in
raw.get("attachments", [])] will treat a string like "diagram.png" as an
iterable of characters; instead, normalize raw.get("attachments") into a proper
list first (e.g., assign attachments_raw = raw.get("attachments", []) and if
isinstance(attachments_raw, str) then wrap it: attachments_list =
[attachments_raw]; if attachments_raw is None set to []; otherwise ensure it's
an iterable/list) and then use attachments=[str(a) for a in attachments_list];
update the code around the attachments assignment to perform this normalization
so string-valued attachments are handled safely.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5d277667-c254-40ce-8673-a14eb068ef6c

📥 Commits

Reviewing files that changed from the base of the PR and between 6160ea6 and 5dd41fc.

📒 Files selected for processing (13)
  • docs/operations/run.md
  • src/bernstein/adapters/base.py
  • src/bernstein/adapters/claude.py
  • src/bernstein/adapters/gemini.py
  • src/bernstein/cli/main.py
  • src/bernstein/cli/run_bootstrap.py
  • src/bernstein/core/agents/multimodal_attestation.py
  • src/bernstein/core/persistence/lineage_signer.py
  • src/bernstein/core/planning/plan_loader.py
  • src/bernstein/core/security/audit_chain.py
  • src/bernstein/core/tasks/models.py
  • tests/integration/test_run_attach.py
  • tests/unit/test_multimodal_attestation.py

Round-trip and audit fidelity:
- multimodal_attestation: hash + store the base64-decoded bytes from
  content_base64 (the bytes that actually travel to the model API)
  instead of re-reading content_path. Eliminates the race where the
  on-disk file changes between encode and attest time and the audit
  record no longer matches the inlined bytes.
- claude / gemini _inject_multimodal_attachments: compute the
  announced sha256 from the decoded base64 payload, not from a fresh
  filesystem read.
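
A minimal sketch of the hash-what-you-send rule described above, assuming an <attachment> block with mime and sha256 attributes (the exact attribute names and layout of the documented wire format are not shown in this thread, so treat them as illustrative):

```python
import base64
import binascii
import hashlib


def attachment_block(content_base64: str, mime_type: str) -> str:
    """Build one inline block whose sha256 covers the decoded payload bytes."""
    try:
        raw = base64.b64decode(content_base64, validate=True)
        digest = hashlib.sha256(raw).hexdigest()
    except (binascii.Error, ValueError):
        # Undecodable payload: announce no digest rather than crash.
        digest = ""
    return (
        f'<attachment mime="{mime_type}" sha256="{digest}">\n'
        f"{content_base64}\n"
        "</attachment>"
    )
```

Since the digest is derived from the same string that is inlined, a later change to the file on disk cannot make the announced sha256 disagree with the bytes the model receives.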

Resolver correctness:
- resolve_attachment_for_worker filters chain events by
  (sha256, worktree_id) before picking, so the same bytes attached
  in wt-a AND wt-b both resolve in their own worktree. Previously a
  later attach in wt-b could shadow a valid wt-a resolve.
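
The resolver change can be sketched as follows. Plain lists and dicts stand in for the audit chain and CAS, the function name is shortened, and the choice of which exception is raised when the digest exists only in another worktree is an assumption for the example.

```python
class WorktreeAccessDenied(Exception):
    """The digest was attached, but not in the requesting worktree."""


def resolve_attachment(events: list, blobs: dict, sha256: str,
                       requesting_worktree_id: str) -> bytes:
    """Resolve by the (sha256, worktree_id) pair rather than sha256 alone."""
    same_digest = [e for e in events if e["sha256"] == sha256]
    if not same_digest:
        raise FileNotFoundError(f"no attach event for {sha256}")
    mine = [e for e in same_digest
            if e["worktree_id"] == requesting_worktree_id]
    if not mine:
        raise WorktreeAccessDenied(
            f"{sha256} is not attached in worktree {requesting_worktree_id!r}"
        )
    return blobs[sha256]
```

With the filter applied before picking an event, a later attach of the same bytes in wt-b no longer affects whether a wt-a worker can resolve its own attachment.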

Input validation:
- build_attachment_parent_uri rejects non-hex strings (e.g. 'x'*64)
  with a structured LineageSignerError.
- plan_loader rejects scalar `attachments` (e.g. 'attachments: shot.png')
  with a PlanLoadError instead of iterating the string character by
  character.
- Task.from_dict raises TypeError on scalar attachments payloads via
  a new _normalize_attachments helper, with a clear hint.
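
The two validation rules above might look roughly like this. The URI scheme string, the ValueError used in place of LineageSignerError, and the function names are placeholders; only the checks themselves (64 lowercase hex characters, list-only attachments) follow the commit message.

```python
import re

_HEX64 = re.compile(r"[0-9a-f]{64}")
_SCHEME = "attachment:sha256:"  # hypothetical; the real scheme is not shown


def build_parent_uri(sha256: str) -> str:
    """Accept only a 64-character lowercase hex digest."""
    if not isinstance(sha256, str) or _HEX64.fullmatch(sha256) is None:
        raise ValueError(f"invalid sha256 digest: {sha256!r}")
    return _SCHEME + sha256


def normalize_attachments(value: object) -> list:
    """Reject scalars so a string is never iterated character by character."""
    if value is None:
        return []
    if isinstance(value, (str, bytes)):
        raise TypeError(
            "attachments must be a list of paths, got a scalar; "
            "write it as a list, e.g. attachments: [shot.png]"
        )
    if not isinstance(value, (list, tuple)):
        raise TypeError(f"attachments must be a list, got {type(value).__name__}")
    return [str(item) for item in value]
```

Without the scalar check, `[str(a) for a in "shot.png"]` would silently produce eight one-character "paths".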

Concurrency:
- AuditChainStore.log_with_prev_digest holds a threading.Lock around
  the (read prev_chain_digest -> append) pair so two concurrent
  attaches always embed distinct predecessors and the on-disk chain
  stays linear.
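
The locking pattern can be illustrated with a toy chain. This sketch uses a plain SHA-256 over the entry instead of the project's HMAC, and the class and attribute names are invented; the point is only that reading the previous digest and appending happen under one lock.

```python
import hashlib
import threading


class LinearChain:
    """Toy append-only chain with an atomic read-prev-then-append step."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._append_lock = threading.Lock()
        self.entries = []

    def append(self, details: dict) -> dict:
        with self._append_lock:
            # Read the predecessor and append inside the same critical
            # section so no two entries can share a predecessor.
            prev = self.entries[-1]["digest"] if self.entries else self.GENESIS
            merged = {**details, "prev_chain_digest": prev}
            digest = hashlib.sha256(
                repr(sorted(merged.items())).encode()
            ).hexdigest()
            entry = {**merged, "digest": digest}
            self.entries.append(entry)
            return entry
```

If the lock covered only the append, two threads could read the same tail digest and fork the chain; holding it across both steps keeps every prev_chain_digest unique.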

Docs:
- docs/operations/run.md adds explicit verification commands
  (bernstein adapters list/check, bernstein doctor) and a pointer to
  the canonical capability function.

New tests:
- Hash-matches-base64 invariant after on-disk mutation.
- Cross-worktree dual-attach: both worktrees resolve independently.
- Non-hex / uppercase digest rejection.
- Atomic prev_chain_digest under 16-thread concurrent appends.
- Task.from_dict scalar attachments rejected.
- plan_loader scalar attachments rejected.

bot-ack: 3284182740
bot-ack: 3284182744
bot-ack: 3284182752
bot-ack: 3284182756
bot-ack: 3284182761
bot-ack: 3284182781
bot-ack: 3284182784
bot-ack: 3284182792
bot-ack: 3284182800
@chernistry chernistry merged commit 139ffa3 into main May 21, 2026
61 of 64 checks passed
@chernistry chernistry deleted the feat/1797-multimodal-attestation branch May 21, 2026 20:47