Skip to content

Commit 573ec6b

Browse files
committed
fix(runner+boxlite/rest): classify stopped-box exec as 409 and refuse overwrite=false over REST
Two related real bugs surfaced by the Tokyo e2e suite once the deserialization gate stopped poisoning every list call: 1. test_files_io.py::test_copy_in_overwrite_false_rejects_conflict Root cause: src/boxlite/src/rest/litebox.rs::copy_into named the CopyOptions argument `_opts` — i.e. dropped on the floor. The REST upload protocol is a single tar that the runner extracts over container_dst unconditionally, so overwrite=false silently turned into overwrite=true and the test's host file replaced the seeded guest content. The test contract explicitly accepts a raised exception ('the runtime behaviour can be either a raised exception or a no-op (FFI raises) — either is acceptable; the contract is the guest content must be unchanged'), so the REST backend now returns BoxliteError::Unsupported up front rather than silently clobbering. A real per-entry skip-if-exists implementation would require an end-to-end protocol change (overwrite query param + runner-side staged-vs-existing check); follow-up work, not in scope for the gate. 2. test_execution_shutdown.py::test_exec_on_stopped_box_is_typed_error Root cause: apps/runner/pkg/api/controllers/boxlite_exec.go :: BoxliteExec wraps every execManager.Start error in a generic 500 ('exec failed: ...'). A stopped-box exec surfaces from libboxlite as the typed Go SDK code 11 (ErrStopped) — the SDK already exposes sdkboxlite.IsStopped/IsInvalidState helpers for exactly this case. Use them to map to 409 Conflict (box is in a state that can't accept execs), so the response is a typed 4xx instead of a leaked 500 'Handle invalidated after stop()'. Local-verified: - cargo test -p boxlite --features rest rest::types::tests + the new build_platform_ref / oversize cases still pass (44 passing, the one pre-existing ws_watchdog_fires_when_idle flaky timing test fails unrelated). - cd apps/runner && go vet ./pkg/api/controllers/ clean (full link needs libboxlite.a which the deploy pipeline rebuilds).
1 parent 2d3adfe commit 573ec6b

2 files changed

Lines changed: 34 additions & 1 deletion

File tree

apps/runner/pkg/api/controllers/boxlite_exec.go

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ import (
66
"net/http"
77
"time"
88

9+
sdkboxlite "github.com/boxlite-ai/boxlite/sdks/go"
910
"github.com/boxlite-ai/runner/pkg/boxlite"
1011
"github.com/boxlite-ai/runner/pkg/runner"
1112
"github.com/gin-gonic/gin"
@@ -74,6 +75,19 @@ func BoxliteExec(ctx *gin.Context) {
7475

7576
execId, err := execManager.Start(ctx.Request.Context(), bx, boxId, startOpts)
7677
if err != nil {
78+
// Classify so a stopped / wrong-state box surfaces as 4xx, not
79+
// 500. Without this the box-stopped case leaks
80+
// internal error: HTTP 500 Internal Server Error:
81+
// {"error":"exec failed: failed to start execution: boxlite: stopped:
82+
// Handle invalidated after stop(). ... (code=11)"}
83+
// and the SDK's 'all 5xx is bug' guard fires. The Go SDK already
84+
// gives us a typed bool for the two states that aren't 500 from
85+
// the runner's perspective (the box exists but isn't accepting
86+
// execs right now).
87+
if sdkboxlite.IsStopped(err) || sdkboxlite.IsInvalidState(err) {
88+
ctx.JSON(http.StatusConflict, gin.H{"error": fmt.Sprintf("exec failed: %s", err)})
89+
return
90+
}
7791
ctx.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("exec failed: %s", err)})
7892
return
7993
}

src/boxlite/src/rest/litebox.rs

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -209,10 +209,29 @@ impl BoxBackend for RestBox {
209209
&self,
210210
host_src: &Path,
211211
container_dst: &str,
212-
_opts: CopyOptions,
212+
opts: CopyOptions,
213213
) -> BoxliteResult<()> {
214214
let box_id = self.box_id_str();
215215

216+
// Honor overwrite=false at the REST boundary. The runner's
217+
// upload handler always extracts the tar over whatever's at
218+
// container_dst (the test in scripts/test/e2e/cases/test_files_io.py::
219+
// test_copy_in_overwrite_false_rejects_conflict was catching
220+
// 'overwrite=False replaced guest content'). The test contract
221+
// explicitly accepts a raised exception OR a no-op — the
222+
// invariant is "guest content must be unchanged". We can't
223+
// express the per-entry "skip if dst exists" semantics over
224+
// the current single-tar upload protocol, so refuse the call
225+
// outright instead of silently clobbering.
226+
if !opts.overwrite {
227+
return Err(BoxliteError::Unsupported(
228+
"copy_into with overwrite=false is not supported over the REST backend; \
229+
the current upload protocol cannot express per-entry skip-if-exists. \
230+
Use the FFI backend or pre-check the destination before calling."
231+
.into(),
232+
));
233+
}
234+
216235
// Create tar archive from host path
217236
let tar_bytes = create_tar_from_path(host_src)?;
218237

0 commit comments

Comments
 (0)