Ensure webhook availability during Konnectivity Agent rolling update #566

Open
@dippynark

Description

We are running Gatekeeper as a validating webhook on GKE (although I don't think the webhook implementation or cloud provider matters), and we have a test that performs a rolling update of Gatekeeper while continuously making Kubernetes API requests that should be rejected, to verify that requests and connections are drained properly.
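A minimal sketch of such a test loop (the manifest name `denied.yaml` is hypothetical; this assumes a Gatekeeper constraint exists that should reject the object it contains):

```shell
#!/usr/bin/env bash
# Continuously submit a request that the Gatekeeper webhook should deny.
# Any apply that succeeds indicates a request slipped past the webhook
# (e.g. because the webhook failed open during the rolling update).
while true; do
  if kubectl apply -f denied.yaml --dry-run=server >/dev/null 2>&1; then
    echo "$(date -u +%FT%TZ) request was ADMITTED (webhook bypassed)"
  fi
  sleep 0.1
done
```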

However, if we also delete Konnectivity Agent Pods while rolling Gatekeeper (gradually, so that the Konnectivity Agent Pods are never all down at the same time), or perform a rolling update (kubectl rollout restart deployment -n kube-system konnectivity-agent), then a few requests are allowed through (the ValidatingWebhookConfiguration is configured to fail open).
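The gradual deletion can be sketched as follows (this assumes the agent Pods carry the `k8s-app=konnectivity-agent` label used on GKE; one Pod is deleted at a time and the Deployment is allowed to recover before the next):

```shell
# Delete Konnectivity Agent Pods one at a time, waiting for the
# Deployment to return to full availability between deletions, so the
# agents are never all down simultaneously.
for pod in $(kubectl get pods -n kube-system -l k8s-app=konnectivity-agent \
    -o jsonpath='{.items[*].metadata.name}'); do
  kubectl delete pod -n kube-system "$pod"
  kubectl rollout status deployment -n kube-system konnectivity-agent
done
```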

My question is: is this an issue, or is Konnectivity Agent behaving as expected? My guess is that the long-lived HTTP keepalive connections between the Kubernetes API server and Gatekeeper (via Konnectivity Server and Konnectivity Agent) are broken when a Konnectivity Agent terminates and are not drained properly, because Konnectivity Agent cannot inspect the encrypted requests and therefore cannot disable keepalive before shutting down.
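For reference, the fail-open behaviour mentioned above corresponds to `failurePolicy: Ignore` on the webhook. It can be checked with a command like the following (the configuration name is the Gatekeeper default and may differ in other installations):

```shell
# Print the failurePolicy of each webhook in the Gatekeeper
# ValidatingWebhookConfiguration; "Ignore" means fail open.
kubectl get validatingwebhookconfiguration \
  gatekeeper-validating-webhook-configuration \
  -o jsonpath='{range .webhooks[*]}{.name}{": "}{.failurePolicy}{"\n"}{end}'
```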

Should the Kubernetes API server be able to detect such TCP disconnects and retry validation after reconnecting?

Metadata

Labels

lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
