feat(cli): add --keep-sandbox flag to fullsend run by rh-hemartin · Pull Request #1595 · fullsend-ai/fullsend

rh-hemartin · 2026-05-27T13:26:50Z

Summary

Adds --keep-sandbox flag to fullsend run
When set, skips sandbox deletion at end of run regardless of success or failure
Prints the sandbox name and the exact openshell sandbox exec command to drop into a shell

Why

Post-failure sandbox inspection currently requires re-running with extra instrumentation. This flag lets developers exec directly into the sandbox after a failure to check uploaded files, env contents, or reproduce commands manually.

Closes #1594

Test plan

Run fullsend run triage --keep-sandbox and confirm sandbox is not deleted
Confirm printed openshell sandbox exec command drops into a shell in the sandbox
Run without flag and confirm sandbox is still deleted normally

🤖 Generated with Claude Code

github-actions · 2026-05-27T13:28:26Z

Site preview

Preview: https://9e4133b1-site.fullsend-ai.workers.dev

Commit: 05be86297184924ba690620f5977e8449a9967ee

fullsend-ai-review · 2026-05-27T13:32:02Z

Review

Findings

Medium

[intent-alignment] internal/cli/run.go:531 — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress (new claudeOutputPath parameter and file creation logic) and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture introduces a file write into the agent execution hot path and changes the I/O plumbing. While the implementation is correct and degrades gracefully on file-creation failure, it should be described in the PR summary so reviewers can evaluate it independently.
Remediation: Either (a) update the PR description to document the output-capture feature and explain why it belongs in this PR, or (b) split it into a separate PR.

Low

[test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The TeeReader integration (file creation failure graceful degradation, drain-on-error path writing remaining output to file) is testable and worth covering.
[naming-convention] internal/cli/run.go:1081 — The error variable ferr does not follow the established naming pattern in this file. All other scoped error variables use descriptive <operation>Err names (chmodErr, clearErr, extractErr, execErr, valErr, dlErr, statErr). The name ferr is a terse abbreviation that does not communicate what operation failed.
Remediation: Rename ferr to createErr to match the codebase convention.
[docs-currency] docs/guides/user/running-agents-locally.md:165 — The user guide documents --no-post-script as a tip for avoiding side-effects when running agents locally. The new --keep-sandbox flag serves a similar debugging purpose (preserving the sandbox for inspection) and would naturally belong alongside this tip.
Remediation: Add a note near --no-post-script documenting --keep-sandbox.
[docs-currency] docs/guides/dev/cli-internals.md:304 — The sandbox lifecycle diagram shows the final step as Delete() unconditionally. With the new --keep-sandbox flag, this step is now conditionally skipped. The diagram does not reflect this behavioral change.
Remediation: Update the sandbox lifecycle diagram to indicate that Delete() is skipped when --keep-sandbox is passed.

Previous run

Review

Findings

Medium

[intent-alignment] internal/cli/run.go — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture introduces a file write into the agent execution hot path and changes the I/O plumbing (replacing direct stdout reads with a TeeReader). While the implementation is correct and degrades gracefully on file-creation failure, it should be described in the PR summary so reviewers can evaluate it independently.
Remediation: Either (a) update the PR description to document the output-capture feature and explain why it belongs in this PR, or (b) split it into a separate PR.

Low

[docs-currency] docs/guides/user/running-agents-locally.md — The user guide has a "Testing without side effects" section (line 365) documenting --no-post-script for debugging workflows. The new --keep-sandbox flag serves a similar debugging purpose (post-failure sandbox inspection) and would naturally belong in that section or a nearby "Debugging sandbox failures" subsection.
Remediation: Add a short paragraph documenting --keep-sandbox near the --no-post-script section.
[test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The TeeReader integration (file creation failure degradation, drain-on-error path) is particularly testable and worth covering.

Previous run (2)

Review

Findings

Medium

[intent-alignment] internal/cli/run.go — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture is a useful companion feature for post-failure debugging, but it changes the I/O path of the agent runner (introducing a file write into the hot path) and should be described in the PR summary so reviewers can evaluate it independently. Consider either (a) adding a section to the PR body explaining the JSONL capture rationale, or (b) splitting it into a separate PR.

Low

[docs-currency] docs/guides/user/running-agents-locally.md — The user guide documents fullsend run flags inline (e.g., --no-post-script, --env-file, --fullsend-binary) but will not mention the new --keep-sandbox flag after this PR merges. Consider adding a brief mention alongside the --no-post-script documentation.
[test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The existing test file (run_test.go) tests flag registration but not runtime behavior. The TeeReader integration (file creation failure degradation, drain-on-error path) is particularly testable and worth covering.

Previous run (3)

Review

Findings

Medium

[intent-alignment] internal/cli/run.go — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture is a useful companion feature for post-failure debugging, but it changes the I/O path of the agent runner (introducing a file write into the hot path) and should be described in the PR summary so reviewers can evaluate it independently. Consider either (a) adding a section to the PR body explaining the JSONL capture rationale, or (b) splitting it into a separate PR.

Low

[docs-currency] docs/guides/user/running-agents-locally.md — The user guide documents fullsend run flags inline (e.g., --no-post-script, --env-file, --fullsend-binary) but will not mention the new --keep-sandbox flag after this PR merges. Consider adding a brief mention alongside the --no-post-script documentation.
[test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The existing test file (run_test.go) tests flag registration but not runtime behavior. The TeeReader integration (file creation failure degradation, drain-on-error path) is particularly testable and worth covering.

Previous run (4)

Review

Findings

Medium

[intent-alignment] internal/cli/run.go — PR bundles undocumented changes beyond the --keep-sandbox feature described in the PR body and issue cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594. The claude-output.jsonl TeeReader capture (new outputPath parameter in runAgentWithProgress, file creation at iterDir/claude-output.jsonl) is a distinct feature that tees agent stdout to disk. The --verbose comment removal in buildClaudeCommand is also undocumented. These changes are individually reasonable but should be described in the PR body or split into separate commits so reviewers can evaluate each change on its own merits.
Remediation: Either (a) update the PR description to document the output-capture feature and explain why it belongs in this PR, or (b) split it into a separate commit/PR.

Low

[docs-currency] docs/guides/user/running-agents-locally.md — The user guide has a "Testing without side effects" section (line 365) that documents --no-post-script for debugging workflows. The new --keep-sandbox flag serves a similar debugging purpose (post-failure sandbox inspection) and would naturally belong in that section or a nearby "Debugging sandbox failures" section.
Remediation: Add a short paragraph documenting --keep-sandbox near the --no-post-script section.

Info

[test-coverage] internal/cli/run.go — No test coverage for the keepSandbox early-return path or the new outputPath parameter in runAgentWithProgress. These are CLI integration paths that are difficult to unit test, but the TeeReader fallback behavior (graceful degradation when os.Create fails) would benefit from a targeted test.

ralphbean · 2026-06-01T19:25:35Z

/fs-review

ralphbean

LGTM. A few non-blocking notes inline.

Skips sandbox deletion at end of run when --keep-sandbox is set. Prints sandbox name and openshell exec command for direct inspection. Useful for post-failure debugging without re-running the full agent. Closes #1594 Signed-off-by: Hector Martinez <hemartin@redhat.com>

github-actions Bot deployed to site-preview May 27, 2026 13:28 View deployment

fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label May 27, 2026

rh-hemartin force-pushed the feat/keep-sandbox branch from 48fd0c5 to b0a4fc0 Compare May 27, 2026 15:19

github-actions Bot deployed to site-preview May 27, 2026 15:21 View deployment

fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels May 27, 2026

rh-hemartin self-assigned this May 28, 2026

rh-hemartin force-pushed the feat/keep-sandbox branch from b0a4fc0 to 4138c8e Compare May 28, 2026 07:54

github-actions Bot deployed to site-preview May 28, 2026 07:59 View deployment

fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels May 28, 2026

github-actions Bot deployed to site-preview June 1, 2026 19:26 View deployment

fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jun 1, 2026

ralphbean approved these changes Jun 1, 2026

View reviewed changes

Comment thread internal/cli/run.go

Comment thread internal/cli/run.go

Comment thread internal/cli/run.go

rh-hemartin force-pushed the feat/keep-sandbox branch from aca1652 to 24a3f2d Compare June 4, 2026 06:47

rh-hemartin force-pushed the feat/keep-sandbox branch from 24a3f2d to 05be862 Compare June 4, 2026 06:48

rh-hemartin enabled auto-merge June 4, 2026 06:48

github-actions Bot deployed to site-preview June 4, 2026 06:49 View deployment

rh-hemartin added this pull request to the merge queue Jun 4, 2026

Merged via the queue into main with commit 213eb75 Jun 4, 2026
8 checks passed

rh-hemartin deleted the feat/keep-sandbox branch June 4, 2026 06:54

fullsend-ai-retro Bot mentioned this pull request Jun 4, 2026

Track and reduce E2E test flakiness in CI #1879

Open

qodo-code-review Bot mentioned this pull request Jun 30, 2026

fix(runner): extract sandbox output to temp dir instead of overwriting target repo #2774

Open

4 tasks

Uh oh!

Conversation

rh-hemartin commented May 27, 2026

Summary

Why

Test plan

Uh oh!

github-actions Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Site preview

Uh oh!

fullsend-ai-review Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review

Findings

Medium

Low

Review

Findings

Medium

Low

Review

Findings

Medium

Low

Review

Findings

Medium

Low

Review

Findings

Medium

Low

Info

Uh oh!

ralphbean commented Jun 1, 2026

Uh oh!

ralphbean left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented May 27, 2026 •

edited

Loading

fullsend-ai-review Bot commented May 27, 2026 •

edited

Loading