Review agent should reason about CI infrastructure usage patterns when reviewing optimization PRs

## What happened

On [PR #2796](https://github.com/fullsend-ai/fullsend/pull/2796), the review agent approved a change that removed sandbox image caching from `action.yml` (+1/−62 lines). The agent's [verdict](https://github.com/fullsend-ai/fullsend/pull/2796) was a clean approval: "Looks good to me." However, human reviewer `rh-hemartin` [requested changes](https://github.com/fullsend-ai/fullsend/pull/2796) on Jul 1, noting: "Code (and fix) are the least run agents, and the other do not use the code image, but the base one. Drop the pull entirely and let the sandbox do its work." This was a substantive architectural insight: the remaining `podman pull` step was unnecessary because only the rarely run code and fix agents use the code image, while the other agents use the base image. The author updated the PR, both the agent and the human reviewer re-approved, and it merged on Jul 2.

## What could go better

The review agent correctly verified that the diff was internally consistent (removing cache logic, dropping skopeo dependency), but it could not assess whether the optimization was *sufficient*. The human reviewer's insight required knowing: (1) which container images are used by which agent types (code/fix use the code image; triage/review/retro use the base image), and (2) that code/fix agents are the least frequently run. This is operational knowledge about the CI infrastructure that isn't documented in `action.yml` or easily inferred from the diff alone.

**Confidence: Medium.** The review agent's approval wasn't wrong per se: the change was correct and beneficial. The gap is that it missed an opportunity to suggest going further. This is a softer failure mode (a missed optimization suggestion) than missing a bug. However, it did cause a rework cycle that could have been avoided.

**Existing issue check:** [#2235](https://github.com/fullsend-ai/fullsend/issues/2235) is thematically adjacent (deployment-environment-aware review) but addresses tool availability, not runtime usage patterns. [#1275](https://github.com/fullsend-ai/fullsend/issues/1275) covers `workflow_call` chain tracing but not image/agent topology. Neither covers the specific gap of understanding which agents use which images and how often each runs.

## Proposed change

Add a section to the repo's `AGENTS.md` (or a dedicated `docs/guides/dev/ci-infrastructure.md`) documenting the sandbox image topology: which container images exist (base vs code), which agent types use each, and their relative run frequencies. This gives the review agent discoverable context when reviewing CI infrastructure changes.

Specifically, add content like:

```
## Sandbox image topology
- **Base image:** Used by triage, review, retro, and prioritize agents (most frequent)
- **Code image:** Used by code and fix agents only (least frequent)
- The composite action (`action.yml`) pre-pulls images before agent execution
```

This is a repo-specific documentation change — the image topology is specific to fullsend-ai/fullsend's CI setup. The review agent already reads `AGENTS.md` as part of its context gathering, so placing the information there ensures it's discoverable without requiring agent definition changes.
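
As a rough illustration of what the documented topology lets a reviewer answer, the sketch below encodes the same facts as data. It is only a sketch under stated assumptions: `SandboxImage`, `TOPOLOGY`, `agents_using`, and `image_for` are hypothetical names that do not exist in the repo, and the image and agent labels are taken from the proposed documentation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxImage:
    """One sandbox image and the agent types that run on it (hypothetical model)."""
    name: str
    agents: tuple
    frequency: str  # relative run frequency, as described in the proposed docs


# Assumption: mirrors the "Sandbox image topology" section proposed above.
TOPOLOGY = {
    "base": SandboxImage("base", ("triage", "review", "retro", "prioritize"), "most frequent"),
    "code": SandboxImage("code", ("code", "fix"), "least frequent"),
}


def agents_using(image: str) -> tuple:
    """Return the agent types affected by a change to the given image."""
    return TOPOLOGY[image].agents


def image_for(agent: str) -> str:
    """Return the image an agent type runs on; raise KeyError if unknown."""
    for img in TOPOLOGY.values():
        if agent in img.agents:
            return img.name
    raise KeyError(agent)
```

With this mapping, a change that touches only code-image pulling resolves to `agents_using("code")`, i.e. the two least frequently run agent types, which is the observation the human reviewer made on PR #2796.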

## Validation criteria

On the next 3 PRs that modify `action.yml` or sandbox image configuration in this repo, the review agent should reference the documented image topology in its review reasoning. If a PR touches image pulling or caching logic, the agent should be able to identify which agent types are affected and comment on whether the change scope is appropriate. This can be verified by checking the review agent's trace/reasoning for references to the documented topology.
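
One possible way to make this check less manual is a small text heuristic over the review agent's output. The sketch below is an assumption, not existing tooling: `references_topology` is a hypothetical helper, and the phrases it looks for are taken from the proposed documentation rather than from any real trace format.

```python
import re

# Assumption: phrases a review would use if it drew on the documented topology.
IMAGE_TERMS = ("base image", "code image")
AGENT_TYPES = ("triage", "review", "retro", "prioritize", "code", "fix")


def references_topology(review_text: str) -> bool:
    """Heuristic: True if the text names a sandbox image and at least one agent type."""
    text = review_text.lower()
    mentions_image = any(term in text for term in IMAGE_TERMS)
    # Match e.g. "fix agent" or "fix agents" as whole words.
    mentions_agent = any(
        re.search(rf"\b{re.escape(agent)}\s+agents?\b", text) for agent in AGENT_TYPES
    )
    return mentions_image and mentions_agent
```

A keyword match like this only shows that the topology was mentioned, not that the reasoning was sound, so a human read of the flagged reviews would still be needed.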

---
_Generated by retro agent from https://github.com/fullsend-ai/fullsend/pull/2796_
