Change success response status of health checks to 200 (OK)

Change the HTTP response status code returned by the Kafka Bridge health check endpoints (/ready and /healthy) to accommodate load balancers that only accept HTTP 200 (OK) responses in health checks.

Current situation

The Kafka Bridge returns HTTP 204 (No Content) for successful health check requests on the /ready and /healthy endpoints. This is semantically correct according to the HTTP standard, but it causes issues with various third-party load balancers that treat only HTTP 200 (OK) as a healthy response code.

Motivation

Cloud load balancers have varying requirements for health check HTTP status codes:

Strict (require HTTP 200, not configurable)

Configurable (default 200, can be changed)

  • AWS: Defaults to HTTP 200, manual configuration required for other response codes.
  • OCI: Defaults to HTTP 200, supports custom status codes.
  • OVHcloud: Defaults to HTTP 200, supports custom status codes.

Flexible (accept 2xx range by default)

The strict requirements prevent deploying the Kafka Bridge on major cloud platforms without workarounds (custom proxies, modified images, or complex routing). See the Appendix for a survey of affected open-source projects.
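
The three categories above can be modeled as simple predicates over the status code. The Java sketch below is illustrative only: the class and predicate names are hypothetical and do not correspond to any provider's API. It is meant to show why a 204 response passes a flexible check but fails a strict check or a configurable check left at its default of 200.

```java
import java.util.function.IntPredicate;

// Hypothetical model of the load-balancer categories; not a real provider API.
final class HealthMatcherSketch {
    // Strict: only HTTP 200 counts as healthy.
    static final IntPredicate STRICT = status -> status == 200;

    // Flexible: any status in the 2xx range counts as healthy.
    static final IntPredicate FLEXIBLE = status -> status >= 200 && status <= 299;

    // Configurable: healthy only if the status equals the configured code.
    static IntPredicate configurable(int expected) {
        return status -> status == expected;
    }

    public static void main(String[] args) {
        // Compare the proposed status code (200) with the current one (204).
        for (int status : new int[] {200, 204}) {
            System.out.printf("status=%d strict=%b default=%b flexible=%b%n",
                    status,
                    STRICT.test(status),
                    configurable(200).test(status),
                    FLEXIBLE.test(status));
        }
    }
}
```

In this model, 204 is accepted only by the flexible matcher or by a configurable matcher that an operator has explicitly set to 204, whereas 200 is accepted by all three.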

Proposal

Change the successful response status code for the /ready and /healthy endpoints from 204 (No Content) to 200 (OK).

The OpenAPI specification and documentation will be updated to reflect the new success response code, HTTP 200 (OK).

The error responses (HTTP 500 Internal Server Error) of the /ready and /healthy endpoints remain unchanged.
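
The proposed behavior can be sketched as a pair of handlers that return 200 on success and 500 on failure. This is a minimal illustration using the JDK's built-in com.sun.net.httpserver package, not the Bridge's actual implementation. The class name, the helper methods, the hard-coded check results, and the empty response body are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; names and wiring are not taken from the Kafka Bridge code base.
final class HealthEndpointSketch {
    static final int OK = 200;                     // proposed success status (previously 204)
    static final int INTERNAL_SERVER_ERROR = 500;  // error status, unchanged by the proposal

    // Maps a check result to the status code described in the proposal.
    static int statusFor(boolean healthy) {
        return healthy ? OK : INTERNAL_SERVER_ERROR;
    }

    // Registers a health endpoint whose status depends on the supplied check.
    static void register(HttpServer server, String path, BooleanSupplier check) {
        server.createContext(path, exchange -> respond(exchange, check.getAsBoolean()));
    }

    private static void respond(HttpExchange exchange, boolean healthy) throws IOException {
        // A response length of -1 means no body is sent (an assumption of this sketch).
        exchange.sendResponseHeaders(statusFor(healthy), -1);
        exchange.close();
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for a free port; the check results are hard-coded for the demo.
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        register(server, "/healthy", () -> true);
        register(server, "/ready", () -> false);
        server.start();
        try {
            int port = server.getAddress().getPort();
            for (String path : new String[] {"/healthy", "/ready"}) {
                HttpURLConnection conn = (HttpURLConnection)
                        URI.create("http://localhost:" + port + path).toURL().openConnection();
                System.out.println(path + " -> " + conn.getResponseCode());
                conn.disconnect();
            }
        } finally {
            server.stop(0);
        }
    }
}
```

Keeping the status mapping in a single function such as statusFor means the change from 204 to 200 touches one place, while the 500 error path stays exactly as it was.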

Affected/not affected projects

This proposal only targets the Strimzi Kafka Bridge.

Compatibility

Although this is a breaking change, it will be aligned with the breaking changes from proposal 122 (Add support for TLS/SSL on the HTTP interface), which introduces a dedicated listener for internal endpoints such as /ready and /healthy.
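
The change is breaking mainly for clients or probes that compare the status code against 204 exactly. One way to bridge the upgrade, offered here only as a suggestion and not as part of the proposal, is a probe that accepts both codes. The Java sketch below uses the standard java.net.http client; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client-side probe that works before and after the status change.
final class TolerantProbeSketch {
    // Accepts the old success code (204) as well as the new one (200).
    static boolean isHealthy(int status) {
        return status == 200 || status == 204;
    }

    // Sends a GET request to a health endpoint and evaluates only the status code.
    static boolean probe(HttpClient client, URI endpoint)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(endpoint).GET().build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return isHealthy(response.statusCode());
    }
}
```

A caller could invoke it as TolerantProbeSketch.probe(HttpClient.newHttpClient(), URI.create("http://localhost:8080/healthy")), where the host and port are placeholders for the actual deployment.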

Rejected alternatives

Making the response status fully configurable

Allowing arbitrary status codes (e.g., KAFKA_BRIDGE_HEALTH_CHECKS_RESPONSE_STATUS=200) would add unnecessary complexity, because only the choice between 200 and 204 has practical relevance for health checks.

Appendix: Cloud Provider and Open-Source Project Survey

This survey demonstrates that HTTP 204 health check incompatibility is a systemic issue affecting both cloud platforms and open-source projects.

Open-Source Projects Affected

  1. InfluxData Telegraf - /ping endpoint incompatible with GCP health checks
  2. InfluxData InfluxDB - Same issue, suggested making status code configurable
  3. Authentik - Changed from 204 to 200 for multi-cloud compatibility
  4. Meilisearch - Changed to 200 for GCP compatibility
  5. Dapr - Google MultiClusterService requires 200
  6. EventStore - /health/live incompatible with GCP
  7. KairosDB - 204 incompatible with AWS ELB
  8. Thumbor - /healthcheck not AWS ELB compliant
  9. Kubernetes Ingress-Nginx - Changed all 204 codes to 200 for GLBC
  10. Eclipse MicroProfile Health - Specification-level debate on 200 vs. 204