Change the HTTP response status code returned by the Kafka Bridge health check endpoints (/ready and /healthy) to accommodate load balancers that only accept HTTP 200 (OK) responses in health checks.
The Kafka Bridge returns HTTP 204 (No Content) for successful health check requests on the /ready and /healthy endpoints. This is semantically correct according to the HTTP standard, but it causes issues with various third-party load balancers that treat only HTTP 200 (OK) as a healthy response code.
Cloud load-balancers have varying requirements for health check HTTP status codes:
- AWS: Defaults to HTTP 200, manual configuration required for other response codes.
- OCI: Defaults to HTTP 200, supports custom status codes.
- OVHcloud: Defaults to HTTP 200, supports custom status codes.
- DigitalOcean: Accepts HTTP 2xx and 3xx.
- Linode: Accepts HTTP 2xx and 3xx.
- Alibaba Cloud: Accepts HTTP 2xx.
These strict requirements prevent deploying the Kafka Bridge on major cloud platforms without workarounds (custom proxies, modified images, or complex routing). See the Appendix for a survey of affected open-source projects.
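To make the impact concrete, the following minimal sketch encodes the default acceptance rules from the list above and reports which providers would reject a given status code. The dictionary and the function name `rejecting_providers` are illustrative only and are not part of the Kafka Bridge or of any provider's API. The rules reflect default settings as summarized in this proposal, not custom configurations.

```python
# Default health check acceptance rules, taken from the list in this proposal.
# Providers that support custom status codes are modeled with their defaults only.
DEFAULT_ACCEPTED = {
    "AWS": lambda status: status == 200,
    "OCI": lambda status: status == 200,
    "OVHcloud": lambda status: status == 200,
    "DigitalOcean": lambda status: 200 <= status < 400,
    "Linode": lambda status: 200 <= status < 400,
    "Alibaba Cloud": lambda status: 200 <= status < 300,
}


def rejecting_providers(status: int) -> list:
    """Return the providers whose default health check would reject `status`."""
    return [name for name, accepts in DEFAULT_ACCEPTED.items() if not accepts(status)]


print(rejecting_providers(204))  # ['AWS', 'OCI', 'OVHcloud']
print(rejecting_providers(200))  # []
```

Under these assumptions, HTTP 204 fails the default check on three of the six providers, while HTTP 200 passes on all of them.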
Change the successful response status code for the /ready and /healthy endpoints from 204 (No Content) to 200 (OK).
The OpenAPI specification and documentation are updated to reflect the new successful response code, HTTP 200 (OK).
The error responses (HTTP 500 Internal Server Error) of the /ready and /healthy endpoints remain unchanged.
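The behavior proposed above can be sketched with a small stand-in server. This is not the Kafka Bridge implementation: the `HealthHandler` class, the `bridge_ok` flag, and the `run_probes` helper are hypothetical names used only for illustration, and the empty response body is an assumption. The sketch only shows the intended status codes: 200 on success and 500 on failure for both endpoints.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATHS = ("/healthy", "/ready")


class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for the bridge's real liveness/readiness state.
    bridge_ok = True

    def do_GET(self):
        if self.path not in HEALTH_PATHS:
            self.send_response(404)
        elif self.bridge_ok:
            self.send_response(200)  # proposed: 200 (OK) instead of 204 (No Content)
        else:
            self.send_response(500)  # unchanged: 500 (Internal Server Error)
        self.send_header("Content-Length", "0")  # assumption: empty body
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def probe(port: int, path: str) -> int:
    """Issue a GET request the way a load balancer health check would."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def run_probes(bridge_ok: bool) -> dict:
    """Start the stand-in server on a free port and probe both endpoints."""
    HealthHandler.bridge_ok = bridge_ok
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return {path: probe(port, path) for path in HEALTH_PATHS}
    finally:
        server.shutdown()
        server.server_close()


print(run_probes(bridge_ok=True))   # {'/healthy': 200, '/ready': 200}
print(run_probes(bridge_ok=False))  # {'/healthy': 500, '/ready': 500}
```

A load balancer that accepts only HTTP 200 would mark the first case healthy and the second unhealthy, which is the outcome this proposal aims for.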
This proposal only targets the Strimzi Kafka Bridge.
While this is a breaking change, it will be aligned with the breaking changes from proposal 122 (Add support for TLS/SSL on the HTTP interface), which introduces a dedicated listener for internal endpoints such as /ready and /healthy.
Allowing arbitrary status codes (e.g., KAFKA_BRIDGE_HEALTH_CHECKS_RESPONSE_STATUS=200) adds unnecessary complexity. Only 200 vs. 204 has practical relevance for health checks.
This survey demonstrates that HTTP 204 health check incompatibility is a systemic issue affecting both cloud platforms and open-source projects.
- InfluxData Telegraf - /ping endpoint incompatible with GCP health checks
- InfluxData InfluxDB - Same issue, suggested making status code configurable
- Authentik - Changed from 204 to 200 for multi-cloud compatibility
- Meilisearch - Changed to 200 for GCP compatibility
- Dapr - Google MultiClusterService requires 200
- EventStore - /health/live incompatible with GCP
- KairosDB - 204 incompatible with AWS ELB
- Thumbor - /healthcheck not AWS ELB compliant
- Kubernetes Ingress-Nginx - Changed all 204 codes to 200 for GLBC
- Eclipse MicroProfile Health - Specification-level debate on 200 vs. 204