This proposal suggests adding a new listener type (type: tlsroute) based on the Gateway API and its TLSRoute resources.
Strimzi currently supports four types of external listeners:
- Load balancers (type: loadbalancer)
- Node ports (type: nodeport)
- OpenShift Routes (OpenShift only) (type: route)
- Ingress with TLS passthrough (based on the Ingress NGINX Controller for Kubernetes) (type: ingress)
The Ingress-based listener (type: ingress) is currently deprecated.
The main motivation for the deprecation is the retirement of the Ingress NGINX Controller for Kubernetes, on which this listener type was based.
Kubernetes Gateway API is an official Kubernetes project that defines API specifications for L4 and L7 routing. It does not include an implementation. It provides only the Custom Resource Definitions for the API. The API is then supported/implemented by other projects. A list of implementations can be found on the Gateway API website.
The Gateway API specification is versioned. Its versioning and release cycle are fully independent of the Kubernetes version. Each version has two release channels: Standard and Experimental. The Standard channel contains stable APIs. The Experimental channel includes everything from the Standard channel plus additional experimental changes and APIs that might still change or be removed in the future.
The Gateway API represents the next generation of Kubernetes Ingress. It supports L7 routing of HTTP and gRPC traffic. It also supports L4 routing of TCP traffic. Gateway API aims to address both north-south traffic (to/from the Kubernetes cluster) and east-west traffic (within the Kubernetes cluster through a service mesh). For this proposal, the focus is north-south traffic routing and providing Apache Kafka access to Kafka clients running outside of the Kubernetes cluster.
Strimzi users have already been able to use Gateway API for some time.
They can, for example, use the type: cluster-ip listener designed for access within the Kubernetes cluster and manually manage the Gateway API resources and advertised hosts and ports.
More details can be found in the Accessing Kafka with Gateway API blog post on the Strimzi blog.
But with Gateway API emerging as the next Kubernetes routing standard, compatible implementations continuing to grow, and type: ingress being deprecated following the retirement of the Ingress NGINX Controller, it’s a natural step for Strimzi to add direct Gateway API support.
The APIs that are relevant to the Kafka protocol (TLSRoute and TCPRoute) are also finally maturing and moving to the Standard API channel.
Direct support will simplify the configuration of Gateway API resources, as Strimzi will manage them automatically and users will no longer need to do it manually.
It will also make it easier to combine Gateway API with other Strimzi features, such as horizontal autoscaling, which would otherwise be complicated (as users would need to create Gateway API resources when new brokers are added).
Strimzi will implement support for Gateway API using TLSRoute resources.
In the Strimzi API, this listener will be added as type: tlsroute.
TLSRoute resources support any TLS traffic and use TLS-SNI (client-specified hostname) to decide where to route traffic.
TLSRoute resources typically depend on a Gateway.
The Gateway represents the point of access, and TLSRoute resources tell the Gateway how and where traffic should be routed.
When using the type: tlsroute listener in Strimzi, the Strimzi Cluster Operator will be responsible for managing the TLSRoute resources.
But Strimzi will not be involved in Gateway management.
Users will bring their own Gateway and only reference it in the Strimzi configuration.
TLSRoute resources moved to the Standard API in version 1.4 of the Gateway API specification.
In version 1.5, the v1 version of the TLSRoute API was introduced.
While the v1 version is relatively new, it should provide a stable and future-proof API.
The v1 API is expected to be supported in version 7.7.0 of the Fabric8 Kubernetes Client as well.
This proposal suggests building tlsroute listener support on the v1 API.
The type: tlsroute listener will work on the same principle as the existing type: ingress and type: route listeners.
Strimzi will create the bootstrap and per-broker services.
And then it will create the TLSRoute resources to route the data to these services and through them to the Kafka brokers.
The type: tlsroute API will mostly reuse existing fields already used for other listener types.
The only new field will be the list of parent references.
It will be part of the listener configuration and will be named parentRefs (the same name used by Gateway API itself).
Parent references are used in TLSRoute configuration and define which gateways will handle the TLS routes.
However, parent references might not always point directly to a gateway.
They might also point, for example, to a ListenerSet resource (which will further point to one or more Gateway resources).
In any case, one or more parent references have to be provided to Strimzi in order to configure the TLSRoute resources.
The Strimzi parent reference schema would use the same schema as in the Gateway API specification.
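As an illustration of that schema, the fragment below sketches two parent references: one pointing to a Gateway listener and one pointing to a ListenerSet. The field names (group, kind, namespace, name, sectionName) are those of the Gateway API parent reference. The ListenerSet name, its namespace, and its API group are assumptions made for the example.

```yaml
parentRefs:
  # Reference to a listener named "kafka" on a Gateway.
  # group and kind default to a Gateway when omitted.
  - name: kafka-gateway
    sectionName: kafka
  # Reference to a ListenerSet.
  # The name, namespace, and group shown here are hypothetical.
  - group: gateway.networking.k8s.io
    kind: ListenerSet
    name: kafka-listeners
    namespace: gateway-infra
```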
In addition to the list of parent references, the tlsroute listener will reuse fields already used by other listener types:
- The host and hostTemplate fields used to configure the hostname specified in the TLSRoute resources
- The advertisedHost, advertisedHostTemplate, advertisedPort, and advertisedPortTemplate fields to control advertised addresses
The following YAML shows an example of the type: tlsroute listener configuration in a Kafka CR:
listeners:
  - name: external
    port: 9094
    type: tlsroute
    tls: true
    authentication:
      type: tls
    configuration:
      parentRefs:
        - name: kafka-gateway
          sectionName: kafka
      bootstrap:
        host: kafka.192.168.1.221.sslip.io
      advertisedHostTemplate: kafka-{nodeId}-broker.192.168.1.221.sslip.io
      hostTemplate: kafka-{nodeId}-broker.192.168.1.221.sslip.io

TLSRoute resources support TLS passthrough and TLS termination.
While the TLS mode applies to the TLSRoute resources, it is configured in the Gateway Listener TLS configuration in the Gateway or in ListenerSet resources.
TLS passthrough means that the TLS connection is forwarded to Strimzi as-is, including encryption.
In this case, the TLS connection will use the TLS certificate from the Kafka brokers and can use mTLS authentication as well.
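For illustration, a user-managed Gateway with a passthrough listener might look like the sketch below. It is not created or managed by Strimzi. The gatewayClassName is a hypothetical placeholder that depends on the Gateway API implementation in use, and the wildcard hostname is an assumption chosen to cover the bootstrap and per-broker hostnames from the listener example. The listener name matches the sectionName: kafka used in the parent reference.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: kafka-gateway
spec:
  # Hypothetical class name; use the one provided by your implementation
  gatewayClassName: example-gateway-class
  listeners:
    # Referenced from the Strimzi listener through sectionName: kafka
    - name: kafka
      protocol: TLS
      # Matches the default advertised port of the tlsroute listener
      port: 443
      # Assumed wildcard covering the bootstrap and per-broker hostnames
      hostname: "*.192.168.1.221.sslip.io"
      tls:
        # Forward the encrypted connection to the Kafka brokers as-is
        mode: Passthrough
      allowedRoutes:
        kinds:
          - kind: TLSRoute
```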
TLS termination means that the TLS connection will be terminated in the Gateway. The Kafka client will connect with encryption, but the connection will be decrypted by the Gateway. From the Gateway, it will continue in unencrypted form to the Apache Kafka brokers. In this case, the connection will use the Gateway's TLS certificate, and mTLS authentication will not be available. From a Strimzi perspective, the connection will not use TLS encryption. Support for TLS termination mode is new, and I have not found any Gateway API implementation supporting it yet. But I expect it will eventually be supported.
To support both modes, Strimzi will support type: tlsroute listeners both with and without TLS encryption enabled in the Strimzi listener.
Enabling or disabling TLS encryption will not have any impact on how Strimzi configures the TLSRoute resources, as the TLS mode is configured in the Gateway.
It affects only whether Strimzi configures Kafka brokers to use TLS encryption.
mTLS authentication will be allowed only when TLS encryption is enabled.
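As a sketch of the second case, a listener intended for a Gateway that terminates TLS could disable TLS on the Strimzi side and omit mTLS authentication. The listener name in sectionName is a hypothetical example of a Gateway listener configured for termination, and the hostnames are reused from the earlier example.

```yaml
listeners:
  - name: external
    port: 9094
    type: tlsroute
    # TLS is terminated in the Gateway, so the brokers use no TLS encryption
    tls: false
    # No "authentication: type: tls" here: mTLS requires tls: true
    configuration:
      parentRefs:
        - name: kafka-gateway
          # Hypothetical Gateway listener configured for TLS termination
          sectionName: kafka-terminated
      bootstrap:
        host: kafka.192.168.1.221.sslip.io
      advertisedHostTemplate: kafka-{nodeId}-broker.192.168.1.221.sslip.io
      hostTemplate: kafka-{nodeId}-broker.192.168.1.221.sslip.io
```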
The actual port used by TLSRoute resources is configured in the Gateway resource.
So Strimzi will not be aware of it.
Therefore, type: tlsroute listeners will use port 443 as the default advertised port, which is the same as for type: route and type: ingress listeners.
Where needed, users can use the advertisedPort configuration to override the advertised port.
Additionally, two new fields will be added to the Kafka CR template section (.spec.kafka.template):
- externalBootstrapTLSRoute
- perBrokerTLSRoute
And one new field will be added to the KafkaNodePool CR template section (.spec.template):
- perBrokerTLSRoute
These fields will be used for additional configuration of TLSRoute labels and annotations.
This mirrors the existing fields for OpenShift Routes and Ingress resources.
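A possible use of these fields in the Kafka CR is sketched below. It assumes the new fields take the same metadata.labels and metadata.annotations shape as the existing template fields for Routes and Ingress resources. The label and annotation keys and values are hypothetical.

```yaml
spec:
  kafka:
    template:
      externalBootstrapTLSRoute:
        metadata:
          labels:
            # Hypothetical label added to the bootstrap TLSRoute
            example.com/team: kafka
      perBrokerTLSRoute:
        metadata:
          annotations:
            # Hypothetical annotation added to every per-broker TLSRoute
            example.com/exposure: external
```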
Similarly to Ingress resources or OpenShift Routes, TLSRoute resources always point to a Kubernetes Service and forward traffic to it.
The type: tlsroute listener will use the same architecture as these other listeners:
- One per-listener bootstrap service pointing to all Kafka brokers will be created for bootstrapping
- One bootstrap TLSRoute will be created and will point to the bootstrap service
- One per-broker service will be created for each broker
- One per-broker TLSRoute will be created and will point to the corresponding per-broker service
The naming of these resources will follow the same rules as for the current listeners.
The created bootstrap TLSRoute resource will look like this:
apiVersion: gateway.networking.k8s.io/v1
kind: TLSRoute
metadata:
  labels:
    strimzi.io/cluster: my-cluster
    strimzi.io/component-type: kafka
    strimzi.io/kind: Kafka
    strimzi.io/name: my-cluster-kafka
  name: my-cluster-kafka-bootstrap
  ownerReferences:
    - apiVersion: kafka.strimzi.io/v1
      blockOwnerDeletion: true
      controller: false
      kind: Kafka
      name: my-cluster
      uid: c111cc18-9056-4888-a426-c5c701b0ae90
spec:
  hostnames:
    - kafka.192.168.1.221.sslip.io
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: kafka-gateway
      sectionName: kafka
  rules:
    - backendRefs:
        - name: my-cluster-kafka-bootstrap
          port: 9094

The created per-broker TLSRoute resources will look like this:
apiVersion: gateway.networking.k8s.io/v1
kind: TLSRoute
metadata:
  labels:
    strimzi.io/cluster: my-cluster
    strimzi.io/component-type: kafka
    strimzi.io/kind: Kafka
    strimzi.io/name: my-cluster-kafka
    strimzi.io/pool-name: aston
  name: my-cluster-aston-1000
  ownerReferences:
    - apiVersion: kafka.strimzi.io/v1
      blockOwnerDeletion: true
      controller: false
      kind: KafkaNodePool
      name: aston
      uid: c111cc18-9056-4888-a426-c5c701b0ae90
spec:
  hostnames:
    - kafka-1000-broker.192.168.1.221.sslip.io
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: kafka-gateway
      sectionName: kafka
  rules:
    - backendRefs:
        - name: my-cluster-aston-1000
          port: 9094

After creating the TLSRoute resources, Strimzi will wait for the .status section to be updated and contain at least one parent reference.
This indicates that the TLSRoute was accepted by the gateway.
Strimzi will not perform a detailed check of which gateway is among the parent references.
It will not check for other conditions or try to detect any other warnings, errors, or failures.
In case the TLSRoute does not have any parent references after the configured time limit (the Cluster Operator reconciliation timeout), the reconciliation will fail with a corresponding error.
This will help provide flexibility and simplify support for different implementations that handle conditions, warnings, and errors differently.
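To make the check concrete, the fragment below sketches what the .status section of an accepted TLSRoute might look like. The controllerName is a hypothetical value set by the Gateway API implementation, and the condition is abbreviated (fields such as lastTransitionTime and message are left out). Under this proposal, Strimzi would only check that the parents list is not empty and would not inspect the parentRef or the conditions.

```yaml
status:
  parents:
    # At least one entry here is what Strimzi waits for
    - parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: kafka-gateway
        sectionName: kafka
      # Hypothetical controller name reported by the implementation
      controllerName: example.com/gateway-controller
      conditions:
        # Not evaluated by Strimzi under this proposal
        - type: Accepted
          status: "True"
          reason: Accepted
```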
While system tests for type: tlsroute might be added in the future, they are not covered by this proposal.
We already have other listener types that rely fully on unit and manual testing only (e.g., Ingress).
The value of system tests is also diminished in scenarios with multiple different implementations that might behave differently.
Support for the experimental useDefaultGateways flag in the TLSRoute .spec section is out of scope for this proposal.
Support for it might be added in the future.
Another way to expose Apache Kafka with Gateway API could be TCPRoute resources.
TCP routes allow routing of TCP traffic without TLS encryption and without TLS-SNI.
The main advantage of TCP routes is that they allow users to freely choose whether they want to use TLS encryption.
However, TCP routes are harder to configure because, without TLS-SNI, every TCPRoute needs to have a unique address, for example, a different gateway with a unique IP address, or a different port number.
So for a cluster with 10 brokers, you would need 11 different IP addresses or 11 different ports (one for each broker plus one for bootstrapping).
And you would need to configure each of them on a per-broker basis.
That is why this proposal chooses TLS routes as its basis and does not provide any support for TCP routes.
Integrating Strimzi directly with TCP routes is out of scope for this proposal. However, users with unique requirements who prefer using TCP routes can continue to use the manual configuration approach. It is also possible that future proposals will add direct support for TCP routes as well.
Any use of Gateway API for routing internal Kubernetes traffic or service mesh integration is out of scope and is not addressed by this proposal. It might, however, be addressed in a future proposal.
The exact process for migrating from the type: ingress listener is out of scope for this proposal, as it depends on the specific Ingress and Gateway API implementations and on the user's infrastructure.
Strimzi will not provide any documentation for such a migration.
However, users can always:
- Add a new type: tlsroute listener.
- Reconfigure Kafka clients to use the new listener.
- Remove the old type: ingress listener.
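During the transition, both listeners can exist side by side in the Kafka CR, as in the sketch below. The listener names and port numbers are hypothetical, and the existing Ingress configuration is left out because it stays unchanged.

```yaml
listeners:
  # Existing listener, kept until all clients have been reconfigured
  - name: external
    port: 9094
    type: ingress
    tls: true
    # The existing "configuration" section of this listener stays unchanged
  # New listener added alongside it (hypothetical name and port)
  - name: gateway
    port: 9095
    type: tlsroute
    tls: true
    configuration:
      parentRefs:
        - name: kafka-gateway
          sectionName: kafka
      bootstrap:
        host: kafka.192.168.1.221.sslip.io
      advertisedHostTemplate: kafka-{nodeId}-broker.192.168.1.221.sslip.io
      hostTemplate: kafka-{nodeId}-broker.192.168.1.221.sslip.io
```

Once no clients use the old listener anymore, its entry can be removed from the list.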
This proposal affects only the Strimzi Cluster Operator.
This proposal is fully backwards compatible. The existing listeners are not affected in any way by the new listener.
N/A.