Description
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Overview of the Issue
- Gateway Controller Processes Non-Consul Gateways & Deletes Resources Without Provenance Validation.
The controller reconciles every Gateway resource in the cluster, regardless of gateway class.
consul-k8s/control-plane/api-gateway/controllers/gateway_controller.go, lines 432 to 433 in e8fac3f:
    return c, cleaner, ctrl.NewControllerManagedBy(mgr).
        For(&gwv1beta1.Gateway{}).
- Intentional Processing of Non-Consul Gateways
The controller deliberately processes Gateways that Consul does not control. The inline comment justifies this as a way to garbage collect their resources.
consul-k8s/control-plane/api-gateway/controllers/gateway_controller.go, lines 145 to 147 in e8fac3f:
    // add our current gateway even if it's not controlled by us so we
    // can garbage collect any resources for it.
    resources.ReferenceCountGateway(gateway)
- Dangerous Deletion Logic Flow
When a non-consul gateway is processed:
consul-k8s/control-plane/api-gateway/binding/binder.go, lines 98 to 105 in e8fac3f:
    // isGatewayDeleted returns whether we should treat the given gateway as a deleted object.
    // This is true if the gateway has a deleted timestamp, if its GatewayClass does not match
    // our controller name, or if the GatewayClass it references doesn't exist.
    func (b *Binder) isGatewayDeleted() bool {
        gatewayClassMismatch := b.config.GatewayClass == nil || b.config.ControllerName != string(b.config.GatewayClass.Spec.ControllerName)
        isGatewayDeleted := isDeleted(&b.config.Gateway) || gatewayClassMismatch || b.config.GatewayClassConfig == nil
        return isGatewayDeleted
    }
Because its GatewayClass does not match the Consul controller name, the gateway is treated as deleted, and the Kubernetes resources associated with it are then deleted.
Gatekeeper deletes resources by name and namespace only, with no provenance validation. For example, this is how a Deployment is deleted:
consul-k8s/control-plane/api-gateway/gatekeeper/deployment.go, lines 81 to 85 in e8fac3f:
    func (g *Gatekeeper) deleteDeployment(ctx context.Context, gwName types.NamespacedName) error {
        err := g.Client.Delete(ctx, &appsv1.Deployment{ObjectMeta: metav1.ObjectMeta{Name: gwName.Name, Namespace: gwName.Namespace}})
        if k8serrors.IsNotFound(err) {
            return nil
        }
Other Kubernetes resources that do not belong to this controller are deleted in the same way.
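The flow described above can be modeled in a few lines of plain Go. This is only a sketch: `BinderConfig`, `GatewayClass`, `key`, and `deleteByName` are simplified stand-ins invented for illustration, not the consul-k8s types, and `example.com/other-controller` is a placeholder controller name. It also assumes the other controller created a Deployment with the same name and namespace as its Gateway.

```go
package main

import "fmt"

// GatewayClass is a simplified stand-in for the Gateway API type
// (an assumption for illustration, not the real definition).
type GatewayClass struct {
	ControllerName string
}

// BinderConfig is a simplified stand-in for the binder configuration.
type BinderConfig struct {
	ControllerName    string        // name this controller answers to
	GatewayClass      *GatewayClass // class referenced by the Gateway, nil if missing
	HasClassConfig    bool          // stands in for GatewayClassConfig != nil
	DeletionTimestamp bool          // stands in for isDeleted(&Gateway)
}

// isGatewayDeleted mirrors the boolean logic quoted from binder.go:
// a class mismatch is indistinguishable from a real deletion.
func isGatewayDeleted(c BinderConfig) bool {
	classMismatch := c.GatewayClass == nil || c.ControllerName != c.GatewayClass.ControllerName
	return c.DeletionTimestamp || classMismatch || !c.HasClassConfig
}

// key identifies an object by namespace and name only.
type key struct{ Namespace, Name string }

// deleteByName mirrors the shape of deleteDeployment: it removes whatever
// object has the gateway's name and namespace, without checking its creator.
func deleteByName(store map[key]string, k key) {
	delete(store, k)
}

func main() {
	// A Deployment created by another controller that shares the Gateway's name.
	deployments := map[key]string{{"kgateway-system", "http"}: "created-by-other-controller"}

	cfg := BinderConfig{
		ControllerName: "consul.hashicorp.com/gateway-controller",
		GatewayClass:   &GatewayClass{ControllerName: "example.com/other-controller"},
		HasClassConfig: true,
	}
	if isGatewayDeleted(cfg) {
		deleteByName(deployments, key{"kgateway-system", "http"})
	}
	fmt.Println("deployments left:", len(deployments))
}
```

Running the sketch prints `deployments left: 0`: the live, foreign-class Gateway is classified as deleted, and the Deployment that shares its name is removed.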
Reproduction Steps
- Deploy any non-Consul gateway controller (for example, kgateway)
- Create a Gateway that uses that controller's GatewayClass
- Deploy consul-k8s with API Gateway controller
- Watch consul delete resources it didn't create
Logs
2025-10-13T22:05:35.661Z DEBUG Reconciling Gateway {"gateway": {"name":"http","namespace":"kgateway-system"}}
2025-10-13T22:05:35.965Z DEBUG controllers.GatewayClass Reconciling GatewayClass {"gatewayClass": "kgateway"}
2025-10-13T22:05:36.062Z DEBUG controllers.GatewayClass Reconciling GatewayClass {"gatewayClass": "kgateway"}
2025-10-13T22:05:36.161Z DEBUG controllers.GatewayClass Reconciling GatewayClass {"gatewayClass": "kgateway"}
2025-10-13T22:05:36.162Z DEBUG deleting from Consul {"gateway": {"name":"http","namespace":"kgateway-system"}, "kind": "api-gateway", "namespace": "", "name": "http"}
Expected behavior
- Gateway Class Filtering
- Controller should ONLY reconcile gateways with gatewayClassName referencing a GatewayClass controlled by consul.hashicorp.com/gateway-controller
- Non-consul gateways should be ignored completely - no reconciliation, no processing, no log messages
- Safe Resource Deletion
- Before deleting any Kubernetes resource, controller must validate ownership:
- Check for consul-specific labels: gateway.consul.hashicorp.com/managed: "true"
- Verify consul annotations: consul.hashicorp.com/gateway-kind: "api-gateway"
- Validate owner references pointing to the consul-managed Gateway
- Skip deletion if resource wasn't created by consul with clear logging: "Skipping deletion - resource not managed by consul"
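The two expectations above could be sketched roughly as follows. This is a self-contained illustration, not the consul-k8s implementation: `Object`, `Gateway`, `OwnerRef`, `shouldReconcile`, `ownedByGateway`, and `deleteIfOwned` are hypothetical names, and the label and annotation keys are the ones proposed in this issue, which may not be what consul-k8s actually sets on the resources it creates. Real code would work with `metav1.ObjectMeta` and the controller-runtime client instead of these stand-ins.

```go
package main

import "fmt"

// OwnerRef, Object, and Gateway are minimal stand-ins for Kubernetes
// metadata (assumptions for illustration only).
type OwnerRef struct {
	Kind, Name, UID string
}

type Object struct {
	Name, Namespace string
	Labels          map[string]string
	Annotations     map[string]string
	Owners          []OwnerRef
}

type Gateway struct {
	Name, Namespace, UID string
	ClassName            string
}

const consulController = "consul.hashicorp.com/gateway-controller"

// shouldReconcile reports whether the Gateway's class is handled by Consul.
// classControllers maps a GatewayClass name to its spec.controllerName;
// an unknown class yields "" and is therefore ignored.
func shouldReconcile(gw Gateway, classControllers map[string]string) bool {
	return classControllers[gw.ClassName] == consulController
}

// ownedByGateway checks the label, annotation, and owner reference
// proposed in this issue before a resource may be deleted.
func ownedByGateway(obj Object, gw Gateway) bool {
	if obj.Labels["gateway.consul.hashicorp.com/managed"] != "true" {
		return false
	}
	if obj.Annotations["consul.hashicorp.com/gateway-kind"] != "api-gateway" {
		return false
	}
	for _, ref := range obj.Owners {
		if ref.Kind == "Gateway" && ref.Name == gw.Name && ref.UID == gw.UID {
			return true
		}
	}
	return false
}

// deleteIfOwned calls del only when ownership is established; otherwise it
// logs and skips. It returns whether a deletion was attempted.
func deleteIfOwned(obj Object, gw Gateway, del func(Object) error) (bool, error) {
	if !ownedByGateway(obj, gw) {
		fmt.Printf("Skipping deletion - resource not managed by consul: %s/%s\n", obj.Namespace, obj.Name)
		return false, nil
	}
	return true, del(obj)
}

func main() {
	gw := Gateway{Name: "http", Namespace: "kgateway-system", UID: "uid-1", ClassName: "kgateway"}
	classes := map[string]string{
		"kgateway": "example.com/other-controller", // placeholder controller name
		"consul":   consulController,
	}
	fmt.Println("reconcile:", shouldReconcile(gw, classes))

	// A Deployment with the Gateway's name but no Consul provenance markers.
	foreign := Object{Name: "http", Namespace: "kgateway-system"}
	attempted, _ := deleteIfOwned(foreign, gw, func(Object) error { return nil })
	fmt.Println("deletion attempted:", attempted)
}
```

With a filter like `shouldReconcile` applied before a Gateway ever reaches the reconciler, the ownership check becomes a second line of defense rather than the only one.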
Environment details
- consul-k8s version: 1.8.3
- values.yaml used to deploy the helm chart:
# Values for Consul Helm chart for the primary federated datacenter "wc"
global:
  name: consul
  datacenter: wc
  # Configure ACLs for the Consul cluster.
  # See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-global-acls
  acls:
    manageSystemACLs: true
    # If ACLs are enabled, we must create a token for secondary
    # datacenters to replicate ACLs.
    createReplicationToken: true
  apiGateway:
    manageExternalCRDs: false
  # Enables WAN federation for this datacenter.
  # See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-global-federation
  federation:
    enabled: true
    #! primaryDatacenter: wc
    # This will cause a Kubernetes secret to be created that
    # can be imported by secondary datacenters to configure them
    # for federation.
    # See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-global-federation-createfederationsecret
    createFederationSecret: true
  # Configures gossip encryption for the Consul cluster.
  # See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-global-gossipencryption
  gossipEncryption:
    # Automatically generate a gossip encryption key and save it to a Kubernetes or Vault secret.
    autoGenerate: true
  # Enables TLS across the cluster to verify authenticity of the Consul servers and clients.
  # This is not the same CA as service mesh CA for service-to-service communication, which is
  # enabled with the `connectInject` option below.
  # See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-global-tls
  tls:
    enabled: true
    # The Consul CA root certificate from the ca-consul-server secret.
    caCert:
      secretName: tls-ca
      secretKey: tls.crt
    caKey:
      secretName: tls-ca
      secretKey: tls.key
# Mesh gateways are gateways between datacenters. They must be enabled
# for federation in Kubernetes since the communication between datacenters
# goes through the mesh gateways.
# See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-meshgateway
meshGateway:
  enabled: true
# Configuration for Consul servers.
# See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#v-server
server:
  replicas: 1
  bootstrapExpect: 1
  connect: true
  # Server certificate and key from the server-cert secret.
  # This certificate is issued by the tls_ca CA out of band
  serverCert:
    secretName: "server-cert"
    secretKey: "tls.crt"
  serverKey:
    secretName: "server-cert"
    secretKey: "tls.key"
  # This should mount the connect-ca-config secret created by ExternalSecret
  # as a volume in the Consul server pods under. This provides the Consul servers
  # with the CA cert and key to sign service mesh certificates.
  extraVolumes:
    # Mounts /consul/userconfig/connect-ca-config/connect_config.json
    - name: connect-ca-config
      type: secret
      load: true
# Configures the automatic Connect sidecar injector.
# See: https://developer.hashicorp.com/consul/docs/reference/k8s/helm#h-connectinject
connectInject:
  enabled: true
  default: false
  # Enable central config to allow auth method creation
  centralConfig:
    enabled: true
  # Enable webhook to ensure proper initialization
  webhook:
    failurePolicy: "Ignore"
  apiGateway:
    # Disable the Gateway API CRDs since we are managing them externally via gwapi.
    manageExternalCRDs: false
  k8sDenyNamespaces: ['kgateway-system']
  namespaceSelector: |
    matchExpressions:
    - key: "kubernetes.io/metadata.name"
      operator: "NotIn"
      values: ["kube-system","local-path-storage","openebs","gmp-system","gke-managed-cim", "argocd","kgateway-system"]
- Kubernetes version: v1.32.x
- Cloud Provider: VMware
- Networking CNI plugin in use: Cilium
Additional Context
This appears to be an architectural design flaw rather than an oversight. Intentionally processing non-Consul gateways for "garbage collection" breaks a basic safety expectation of Kubernetes controllers: a controller should only modify or delete resources it created.