Bug Report
Summary
When a k6 test aborts due to a threshold failure (abortOnFail: true) while --linger is active, the k6 REST API (GET /v1/status) continues returning "running": true — even after k6 prints the linger message. This makes it impossible for external systems polling /v1/status to detect that the test has finished.
Steps to Reproduce
- Write a k6 script with an immediately-breaching threshold:
export const options = {
  thresholds: {
    http_req_failed: [{ threshold: 'rate<0.01', abortOnFail: true }],
  },
};
- Run with --linger:
k6 run --linger script.js
- Cause the threshold to breach (e.g., all requests fail).
- Observe k6 prints:
The test is done, but --linger was enabled, so k6 is waiting for Ctrl+C to continue...
- Poll GET http://localhost:6565/v1/status while k6 is in linger mode.
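To make the last step concrete, here is a minimal Go sketch of how a poller might read the flag out of the response body. It assumes the JSON:API-style envelope (data.attributes.running) that the k6 REST API documentation shows for /v1/status. The type and function names (statusEnvelope, parseRunning) are hypothetical, and the sample body is hand-written, not captured from a real run.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusEnvelope mirrors the assumed shape of a GET /v1/status response.
// Only the "running" attribute is decoded; other attributes are ignored.
type statusEnvelope struct {
	Data struct {
		Attributes struct {
			Running bool `json:"running"`
		} `json:"attributes"`
	} `json:"data"`
}

// parseRunning extracts the "running" flag from a /v1/status response body.
func parseRunning(body []byte) (bool, error) {
	var env statusEnvelope
	if err := json.Unmarshal(body, &env); err != nil {
		return false, fmt.Errorf("decode status: %w", err)
	}
	return env.Data.Attributes.Running, nil
}

func main() {
	// Hand-written sample shaped like the assumed response body.
	sample := []byte(`{"data":{"type":"status","id":"default","attributes":{"running":true}}}`)

	running, err := parseRunning(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("running:", running)
}
```

In a real check, the body would come from an HTTP GET against http://localhost:6565/v1/status while k6 is lingering.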
Expected Behavior
After threshold abort + linger message, /v1/status should return "running": false. The test execution is complete; k6 is only waiting for a termination signal.
Actual Behavior
/v1/status returns "running": true indefinitely while k6 is in linger mode after threshold abort. The test is done but the API reports it as still running.
Normal Case (Works Correctly)
When a test completes normally (no threshold abort) and --linger is active:
- k6 prints the linger message
- /v1/status correctly returns "running": false
This confirms the bug is specific to the threshold abort code path.
Impact
Systems that poll /v1/status to detect k6 completion never see the test finish when a threshold abort occurs with --linger active. This leaves runner processes alive indefinitely.
In k6-operator's PLZ (Private Load Zone) mode, --linger is injected automatically for all runs:
// pkg/resources/jobs/runner.go
if v1alpha1.IsTrue(k6, v1alpha1.CloudPLZTestRun) {
	command = append(command, "--no-setup", "--no-teardown", "--linger")
}
The operator's StoppedJobs() polls /v1/status every ~15 seconds; because it always receives running: true, the TestRun stays in the started stage indefinitely.
Production evidence: We observed a PLZ TestRun stuck in the started stage for over an hour. Logs showed the linger message about 14 seconds into a 14-minute scenario, and /v1/status returned running: true for the entire duration. All VUs had errored (gRPC auth failure → 0 complete, 27 interrupted iterations).
Environment
- Platform: Kubernetes (Linux)
- Reproduced via k6-operator PLZ mode with standard runner image
Proposed Fix
After threshold abort triggers linger, /v1/status should return "running": false — no VUs are active, the test engine has stopped. Only a SIGTERM/SIGINT would terminate the process.
Related Issues