RFC: GitHub PR verification comments from operator by hank-metalbear · Pull Request #9 · metalbear-co/rfcs

hank-metalbear · 2026-02-12T18:08:33Z

Summary

Proposes adding the ability for the mirrord operator to automatically post verification comments on GitHub pull requests when mirrord sessions complete.

When a developer runs mirrord exec against a staging deployment and ends their session, the operator posts a formatted comment on the corresponding PR showing what was tested, against which target, in which namespace, and for how long.

Key design decisions

Operator-based, not cloud-based. Session data never leaves the customer's infrastructure. The only external call is to the customer's own GitHub (or GHE) instance.
Follows the existing Jira webhook pattern. Hooks into the same report_session_finished flow that @gememma built in #925. Estimated ~250 lines of new Rust code.
Zero developer friction. One-time Helm config by cluster admin. No per-repo CI changes, no developer action required beyond using mirrord as usual.
Auth-method agnostic. Works with a PAT, machine user token, or GitHub App installation token — same code path regardless.

Alternatives considered

CLI posts directly using developer's gh auth (rejected: no consistent identity, requires per-developer setup)
mirrord cloud service routes session data (rejected: privacy/compliance concerns with data leaving infra)
GitHub Action per repo queries operator API (rejected: per-repo setup friction)

Requesting review from

@gememma — built the Jira integration this extends
@0x00A5
@Razz4780

🤖 Generated with Claude Code

Proposes adding the ability for the mirrord operator to automatically post verification comments on GitHub PRs when mirrord sessions complete. Follows the existing Jira webhook pattern in the operator codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix #0000 → #9 in RFC header - Add admin dashboard interaction note (license-server based) - Add dashboard integration as future possibility Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aviramha · 2026-02-13T16:41:31Z

Absolutely love it. My main question would've been how we track the right PR but you solved it here so LGTM from my perspective.

Promote traffic summary from future possibilities to a core part of the proposal. The agent accumulates HTTP metadata (method, path, status, content-type, body samples) in-memory during steal-mode sessions and sends the summary to the operator on session close. The PR comment now includes a traffic table and collapsible request/response body samples. Key design decisions: - Steal mode only (mirror mode doesn't see responses) - Bodies capped at 1KB, max 3 samples per endpoint, 100 endpoint cap - Latest session overwrites previous comment (single comment per PR) - Configurable via Helm (enable/disable bodies, max size) - ~550 lines total across agent, protocol, and operator Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mirror mode sees requests but not responses, so the traffic summary shows method, path, request count, and request body samples with response columns as unavailable. This is still valuable for proving which endpoints were tested. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

0x00A5 · 2026-02-16T05:40:02Z

+
+- **Team lead enforcing testing policy:** Wants all PRs to be tested against staging before merge. Can see at a glance which PRs have mirrord verification comments and which don't. In the future, this could become a required check via GitHub's branch protection rules.
+
+- **CI pipeline integration:** A CI job runs `mirrord exec` as part of an end-to-end test suite. The operator posts a verification comment showing the CI-driven test session, including which pipeline triggered it.


CI integration should be via mirrord ci commands.

0x00A5 · 2026-02-16T05:56:43Z

+
+The comment body is rendered from a format function that takes the session spec, duration, and optional traffic summary as input.
+
+### Traffic summary capture (agent-side)


Traffic of a session can be served by multiple agents. I suspect we need to accumulate traffic summary in the operator instead.

0x00A5 · 2026-02-16T06:08:11Z

+
+#### Data flow on session close
+
+When the session ends, the accumulated traffic summary is serialized and sent to the operator alongside the existing session metadata. This extends the session-close protocol message:


I don't think the current message protocol between agent and operator carries session metadata. Need more specific design on how to collect session traffic summary.

0x00A5 · 2026-02-16T06:09:49Z

This is a great idea 👍
Adding more details about how to collect traffic summary would be great.

gememma

Seems like a cool idea, as long as its not too difficult to implement in the backend. Thoughts on user experience pasted from the slack thread:

i think it should have to be enabled by an admin, and then opt-in for devs - we want to avoid annoying devs at all costs
i also think it should make use of toggles/ collapsible sections to avoid the comment taking up a lot of vertical space (annoying on PRs)
we would need to find a way to make sure that changes to the comment didnt generate notifications for people in the thread

gememma · 2026-02-16T10:50:44Z

+| **User** | `han` |
+| **Duration** | 4m 32s |
+
+#### Traffic summary


What happens when there are many, many items in the traffic summary? does github have a maximum comment length? will it cut off weirdly if we don't calculate for that ourselves? since "Traffic summary is disabled via Helm configuration", individual devs can't turn it off - I worry this might be annoying as well when there are lots of requests (I also don't trust GitHub not to do something strange like get laggy when there are very long comments)

gememma · 2026-02-16T10:54:14Z

+
+- **Comment noise.** If a developer runs many short mirrord sessions against the same PR, it could create many comments. This needs a strategy — either updating a single comment or batching (see [Unresolved questions](#unresolved-questions)).
+
+- **Git info accuracy.** The CLI derives repo info from the local git remote. If the developer has an unusual remote configuration (e.g., a fork with a different remote name), the owner/repo may be wrong. This is an edge case that can be addressed by allowing explicit configuration.


Related to this, the Jira plugin does not work for developers using jujutsu instead of git CLI, which may be a large number in some companies (and the number may be increasing). Due to the way jujutsu works, you would need special handling to deal with this situation (I actually don't even know if it's possible - I can think of some potential approaches but there are lots of different ways to use jujutsu)

gememma · 2026-02-16T11:10:27Z

+
+- **Individual developer:** Runs `mirrord exec --target deploy/checkout-service -- node server.js`, tests their checkout flow fix against staging, ends the session. A comment automatically appears on their open PR showing what was tested, against which target, and for how long. The reviewer sees this when they open the PR.
+
+- **Team lead enforcing testing policy:** Wants all PRs to be tested against staging before merge. Can see at a glance which PRs have mirrord verification comments and which don't. In the future, this could become a required check via GitHub's branch protection rules.


if we start blocking PRs, we have to make sure there is a contigency plan here for if the operator goes down 👀 and it can't be something like "change the github CI flow" if they cant merge PRs

gememma · 2026-02-16T11:15:36Z

+
+Holds the `reqwest::Client`, token, and API base URL. Initialized once at operator startup if `github_token_path` is configured.
+
+**2. Branch-to-PR matching**


Note that this wont work for jujutsu users (see other comment) because branches work differently in jj.

gememma · 2026-02-16T11:16:45Z

+
+### Traffic summary capture (agent-side)
+
+When the GitHub verification feature is enabled, the agent accumulates HTTP traffic metadata in-memory during the session. Both steal and mirror modes are supported, with different levels of detail:


It would be nice to cut down on the amount of work by not doing this when the user knows that a run is not asociated with a PR, for example by requiring devs to use --github-summary=true when they actually want to use the bot - with the Jira stuff, its a tiny amount of work being done per session (and it fails very quiety), but this is more significant. Otherwise, we rely on the GitHub API to tell us if there's a relevant PR open (latency ⬆️)

hank-metalbear requested review from 0x00A5, Razz4780 and gememma February 12, 2026 18:08

Fix PR references and add admin dashboard context

192e34b

- Fix #0000 → #9 in RFC header - Add admin dashboard interaction note (license-server based) - Add dashboard integration as future possibility Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

hank-metalbear requested a review from a team February 12, 2026 19:27

hank-metalbear and others added 2 commits February 13, 2026 14:52

0x00A5 reviewed Feb 16, 2026

View reviewed changes

gememma reviewed Feb 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: GitHub PR verification comments from operator#9

RFC: GitHub PR verification comments from operator#9
hank-metalbear wants to merge 4 commits intomainfrom
han/rfc-github-pr-verification

hank-metalbear commented Feb 12, 2026

Uh oh!

aviramha commented Feb 13, 2026

Uh oh!

0x00A5 Feb 16, 2026

Uh oh!

0x00A5 Feb 16, 2026

Uh oh!

0x00A5 Feb 16, 2026

Uh oh!

0x00A5 Feb 16, 2026

Uh oh!

gememma left a comment

Uh oh!

gememma Feb 16, 2026

Uh oh!

gememma Feb 16, 2026

Uh oh!

gememma Feb 16, 2026

Uh oh!

gememma Feb 16, 2026

Uh oh!

gememma Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants


		- Team lead enforcing testing policy: Wants all PRs to be tested against staging before merge. Can see at a glance which PRs have mirrord verification comments and which don't. In the future, this could become a required check via GitHub's branch protection rules.

		- CI pipeline integration: A CI job runs `mirrord exec` as part of an end-to-end test suite. The operator posts a verification comment showing the CI-driven test session, including which pipeline triggered it.


		The comment body is rendered from a format function that takes the session spec, duration, and optional traffic summary as input.

		### Traffic summary capture (agent-side)


		#### Data flow on session close

		When the session ends, the accumulated traffic summary is serialized and sent to the operator alongside the existing session metadata. This extends the session-close protocol message:


		- Comment noise. If a developer runs many short mirrord sessions against the same PR, it could create many comments. This needs a strategy — either updating a single comment or batching (see [Unresolved questions](#unresolved-questions)).

		- Git info accuracy. The CLI derives repo info from the local git remote. If the developer has an unusual remote configuration (e.g., a fork with a different remote name), the owner/repo may be wrong. This is an edge case that can be addressed by allowing explicit configuration.


		- Individual developer: Runs `mirrord exec --target deploy/checkout-service -- node server.js`, tests their checkout flow fix against staging, ends the session. A comment automatically appears on their open PR showing what was tested, against which target, and for how long. The reviewer sees this when they open the PR.

		- Team lead enforcing testing policy: Wants all PRs to be tested against staging before merge. Can see at a glance which PRs have mirrord verification comments and which don't. In the future, this could become a required check via GitHub's branch protection rules.


		Holds the `reqwest::Client`, token, and API base URL. Initialized once at operator startup if `github_token_path` is configured.

		2. Branch-to-PR matching


		### Traffic summary capture (agent-side)

		When the GitHub verification feature is enabled, the agent accumulates HTTP traffic metadata in-memory during the session. Both steal and mirror modes are supported, with different levels of detail:

Conversation

hank-metalbear commented Feb 12, 2026

Summary

Key design decisions

Alternatives considered

Requesting review from

Uh oh!

aviramha commented Feb 13, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gememma left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants