k6 reports running: true from REST API after threshold abort when --linger is active #6073

@mbhargavatb

Bug Report

Summary

When a k6 test aborts due to a threshold failure (abortOnFail: true) while --linger is active, the k6 REST API (GET /v1/status) keeps returning "running": true, even after k6 prints the linger message. External systems that poll /v1/status therefore cannot detect that the test has finished.

Steps to Reproduce

  1. Write a k6 script with an immediately-breaching threshold:
    export const options = {
      thresholds: {
        http_req_failed: [{ threshold: 'rate<0.01', abortOnFail: true }],
      },
    };
  2. Run with --linger:
    k6 run --linger script.js
    
  3. Cause the threshold to breach (e.g., all requests fail).
  4. Observe k6 prints:
    The test is done, but --linger was enabled, so k6 is waiting for Ctrl+C to continue...
    
  5. Poll GET http://localhost:6565/v1/status while k6 is in linger mode.
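
Step 5 can be scripted. The following is a minimal sketch, assuming Node.js 18+ (for the global `fetch`) and the JSON:API-style response of the k6 REST API, where the flag sits at `data.attributes.running`. The helper names `parseRunning` and `fetchRunning` are illustrative and not part of k6.

```javascript
// Extract the "running" flag from a /v1/status response body.
// Assumed shape: { data: { attributes: { running: <boolean> } } }.
function parseRunning(body) {
  const payload = typeof body === 'string' ? JSON.parse(body) : body;
  const running = payload?.data?.attributes?.running;
  if (typeof running !== 'boolean') {
    throw new Error('unexpected /v1/status payload: missing data.attributes.running');
  }
  return running;
}

// Query a k6 instance once. The default is the address used in step 5.
async function fetchRunning(baseUrl = 'http://localhost:6565') {
  const res = await fetch(`${baseUrl}/v1/status`);
  if (!res.ok) {
    throw new Error(`GET /v1/status failed with HTTP ${res.status}`);
  }
  return parseRunning(await res.json());
}
```

Per this report, calling `fetchRunning().then(console.log)` while k6 lingers after a threshold abort would be expected to print `true`, whereas after a normal completion it prints `false`.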

Expected Behavior

After a threshold abort, once the linger message has been printed, /v1/status should return "running": false. The test execution is complete; k6 is only waiting for a termination signal.

Actual Behavior

/v1/status returns "running": true indefinitely while k6 lingers after a threshold abort. The test is done, but the API reports it as still running.

Normal Case (Works Correctly)

When a test completes normally (no threshold abort) and --linger is active:

  • k6 prints the linger message
  • /v1/status correctly returns "running": false

This confirms the bug is specific to the threshold abort code path.

Impact

Systems that poll /v1/status to detect k6 completion never see the test finish when a threshold abort occurs with --linger active. This leaves runner processes alive indefinitely.

In k6-operator's PLZ (Private Load Zone) mode, --linger is injected automatically for all runs:

// pkg/resources/jobs/runner.go
if v1alpha1.IsTrue(k6, v1alpha1.CloudPLZTestRun) {
    command = append(command, "--no-setup", "--no-teardown", "--linger")
}

The operator's StoppedJobs() polls /v1/status every ~15 seconds; because it always gets running: true, the TestRun stays in the started stage indefinitely.
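
To illustrate why the run never leaves started, here is a simplified model of such a polling loop. It is not the operator's actual StoppedJobs() code: `waitUntilStopped`, `getRunning`, `intervalMs`, and `maxPolls` are hypothetical names, and the `maxPolls` cap exists only so the sketch can terminate.

```javascript
// Simplified model of a completion poller (hypothetical; not k6-operator code).
// getRunning: async function that resolves to the "running" flag from /v1/status.
async function waitUntilStopped(getRunning, { intervalMs = 15000, maxPolls = Infinity } = {}) {
  for (let polls = 1; polls <= maxPolls; polls++) {
    if (!(await getRunning())) {
      return { stopped: true, polls };
    }
    // Sleep between polls, but not after the final allowed poll.
    if (polls < maxPolls) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  // Reached only when a cap is set and "running" never became false.
  return { stopped: false, polls: maxPolls };
}
```

With a status source that always reports true, as in the threshold-abort case described here, the loop can only end through the cap; without one it waits forever.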

Production evidence: We observed a PLZ TestRun stuck in started for more than an hour. Logs showed the linger message at ~14 seconds into a 14-minute scenario, with /v1/status returning running: true for the entire duration. All VUs had errored (gRPC auth failure → 0 complete, 27 interrupted iterations).

Environment

  • Platform: Kubernetes (Linux)
  • Reproduced via k6-operator PLZ mode with standard runner image

Proposed Fix

Once a threshold abort has put k6 into linger mode, /v1/status should return "running": false, because no VUs are active and the test engine has stopped. At that point the process is only waiting for a SIGTERM/SIGINT to terminate.
