Skip to content

feat(cli): add --keep-sandbox flag to fullsend run#1595

Merged
rh-hemartin merged 1 commit into
mainfrom
feat/keep-sandbox
Jun 4, 2026
Merged

feat(cli): add --keep-sandbox flag to fullsend run#1595
rh-hemartin merged 1 commit into
mainfrom
feat/keep-sandbox

Conversation

@rh-hemartin

Copy link
Copy Markdown
Member

Summary

  • Adds --keep-sandbox flag to fullsend run
  • When set, skips sandbox deletion at end of run regardless of success or failure
  • Prints the sandbox name and the exact openshell sandbox exec command to drop into a shell

Why

Post-failure sandbox inspection currently requires re-running with extra instrumentation. This flag lets developers exec directly into the sandbox after a failure to check uploaded files, env contents, or reproduce commands manually.

Closes #1594

Test plan

  • Run fullsend run triage --keep-sandbox and confirm sandbox is not deleted
  • Confirm printed openshell sandbox exec command drops into a shell in the sandbox
  • Run without flag and confirm sandbox is still deleted normally

🤖 Generated with Claude Code

@github-actions

github-actions Bot commented May 27, 2026

Copy link
Copy Markdown

Site preview

Preview: https://9e4133b1-site.fullsend-ai.workers.dev

Commit: 05be86297184924ba690620f5977e8449a9967ee

@fullsend-ai-review

fullsend-ai-review Bot commented May 27, 2026

Copy link
Copy Markdown

Review

Findings

Medium

  • [intent-alignment] internal/cli/run.go:531 — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress (new claudeOutputPath parameter and file creation logic) and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture introduces a file write into the agent execution hot path and changes the I/O plumbing. While the implementation is correct and degrades gracefully on file-creation failure, it should be described in the PR summary so reviewers can evaluate it independently.
    Remediation: Either (a) update the PR description to document the output-capture feature and explain why it belongs in this PR, or (b) split it into a separate PR.

Low

  • [test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The TeeReader integration (file creation failure graceful degradation, drain-on-error path writing remaining output to file) is testable and worth covering.

  • [naming-convention] internal/cli/run.go:1081 — The error variable ferr does not follow the established naming pattern in this file. All other scoped error variables use descriptive <operation>Err names (chmodErr, clearErr, extractErr, execErr, valErr, dlErr, statErr). The name ferr is a terse abbreviation that does not communicate what operation failed.
    Remediation: Rename ferr to createErr to match the codebase convention.

  • [docs-currency] docs/guides/user/running-agents-locally.md:165 — The user guide documents --no-post-script as a tip for avoiding side-effects when running agents locally. The new --keep-sandbox flag serves a similar debugging purpose (preserving the sandbox for inspection) and would naturally belong alongside this tip.
    Remediation: Add a note near --no-post-script documenting --keep-sandbox.

  • [docs-currency] docs/guides/dev/cli-internals.md:304 — The sandbox lifecycle diagram shows the final step as Delete() unconditionally. With the new --keep-sandbox flag, this step is now conditionally skipped. The diagram does not reflect this behavioral change.
    Remediation: Update the sandbox lifecycle diagram to indicate that Delete() is skipped when --keep-sandbox is passed.

Previous run

Review

Findings

Medium

  • [intent-alignment] internal/cli/run.go — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture introduces a file write into the agent execution hot path and changes the I/O plumbing (replacing direct stdout reads with a TeeReader). While the implementation is correct and degrades gracefully on file-creation failure, it should be described in the PR summary so reviewers can evaluate it independently.
    Remediation: Either (a) update the PR description to document the output-capture feature and explain why it belongs in this PR, or (b) split it into a separate PR.

Low

  • [docs-currency] docs/guides/user/running-agents-locally.md — The user guide has a "Testing without side effects" section (line 365) documenting --no-post-script for debugging workflows. The new --keep-sandbox flag serves a similar debugging purpose (post-failure sandbox inspection) and would naturally belong in that section or a nearby "Debugging sandbox failures" subsection.
    Remediation: Add a short paragraph documenting --keep-sandbox near the --no-post-script section.

  • [test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The TeeReader integration (file creation failure degradation, drain-on-error path) is particularly testable and worth covering.

Previous run (2)

Review

Findings

Medium

  • [intent-alignment] internal/cli/run.go — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture is a useful companion feature for post-failure debugging, but it changes the I/O path of the agent runner (introducing a file write into the hot path) and should be described in the PR summary so reviewers can evaluate it independently. Consider either (a) adding a section to the PR body explaining the JSONL capture rationale, or (b) splitting it into a separate PR.

Low

  • [docs-currency] docs/guides/user/running-agents-locally.md — The user guide documents fullsend run flags inline (e.g., --no-post-script, --env-file, --fullsend-binary) but will not mention the new --keep-sandbox flag after this PR merges. Consider adding a brief mention alongside the --no-post-script documentation.

  • [test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The existing test file (run_test.go) tests flag registration but not runtime behavior. The TeeReader integration (file creation failure degradation, drain-on-error path) is particularly testable and worth covering.

Previous run (3)

Review

Findings

Medium

  • [intent-alignment] internal/cli/run.go — PR includes undocumented scope beyond cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594: the claude-output.jsonl stream capture via io.TeeReader in runAgentWithProgress and the removal of the --verbose TODO comment in buildClaudeCommand are not mentioned in the PR body or authorized by the linked issue. The JSONL capture is a useful companion feature for post-failure debugging, but it changes the I/O path of the agent runner (introducing a file write into the hot path) and should be described in the PR summary so reviewers can evaluate it independently. Consider either (a) adding a section to the PR body explaining the JSONL capture rationale, or (b) splitting it into a separate PR.

Low

  • [docs-currency] docs/guides/user/running-agents-locally.md — The user guide documents fullsend run flags inline (e.g., --no-post-script, --env-file, --fullsend-binary) but will not mention the new --keep-sandbox flag after this PR merges. Consider adding a brief mention alongside the --no-post-script documentation.

  • [test-coverage] internal/cli/run.go — No unit tests for the --keep-sandbox defer-path behavior or the TeeReader wiring in runAgentWithProgress. The existing test file (run_test.go) tests flag registration but not runtime behavior. The TeeReader integration (file creation failure degradation, drain-on-error path) is particularly testable and worth covering.

Previous run (4)

Review

Findings

Medium

  • [intent-alignment] internal/cli/run.go — PR bundles undocumented changes beyond the --keep-sandbox feature described in the PR body and issue cli: add --keep-sandbox flag to fullsend run for post-failure inspection #1594. The claude-output.jsonl TeeReader capture (new outputPath parameter in runAgentWithProgress, file creation at iterDir/claude-output.jsonl) is a distinct feature that tees agent stdout to disk. The --verbose comment removal in buildClaudeCommand is also undocumented. These changes are individually reasonable but should be described in the PR body or split into separate commits so reviewers can evaluate each change on its own merits.
    Remediation: Either (a) update the PR description to document the output-capture feature and explain why it belongs in this PR, or (b) split it into a separate commit/PR.

Low

  • [docs-currency] docs/guides/user/running-agents-locally.md — The user guide has a "Testing without side effects" section (line 365) that documents --no-post-script for debugging workflows. The new --keep-sandbox flag serves a similar debugging purpose (post-failure sandbox inspection) and would naturally belong in that section or a nearby "Debugging sandbox failures" section.
    Remediation: Add a short paragraph documenting --keep-sandbox near the --no-post-script section.

Info

  • [test-coverage] internal/cli/run.go — No test coverage for the keepSandbox early-return path or the new outputPath parameter in runAgentWithProgress. These are CLI integration paths that are difficult to unit test, but the TeeReader fallback behavior (graceful degradation when os.Create fails) would benefit from a targeted test.

@fullsend-ai-review fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label May 27, 2026
@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels May 27, 2026
@rh-hemartin rh-hemartin self-assigned this May 28, 2026
@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels May 28, 2026
@ralphbean

Copy link
Copy Markdown
Member

/fs-review

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jun 1, 2026

@ralphbean ralphbean left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A few non-blocking notes inline.

Comment thread internal/cli/run.go
Comment thread internal/cli/run.go
Comment thread internal/cli/run.go
Skips sandbox deletion at end of run when --keep-sandbox is set.
Prints sandbox name and openshell exec command for direct inspection.
Useful for post-failure debugging without re-running the full agent.

Closes #1594

Signed-off-by: Hector Martinez <hemartin@redhat.com>
@rh-hemartin rh-hemartin enabled auto-merge June 4, 2026 06:48
@rh-hemartin rh-hemartin added this pull request to the merge queue Jun 4, 2026
Merged via the queue into main with commit 213eb75 Jun 4, 2026
8 checks passed
@rh-hemartin rh-hemartin deleted the feat/keep-sandbox branch June 4, 2026 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

requires-manual-review Review requires human judgment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cli: add --keep-sandbox flag to fullsend run for post-failure inspection

2 participants