KEP-6060: API Server Authentication to Webhooks#6156

Open
pmengelbert wants to merge 5 commits into
kubernetes:masterfrom
pmengelbert:pmengelbert/kep-6060-api-server-authentication-to-webhooks/1

@pmengelbert

@pmengelbert pmengelbert commented Jun 4, 2026

Contributor

@k8s-ci-robot

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 4, 2026
@k8s-ci-robot k8s-ci-robot requested a review from aramase June 4, 2026 16:04
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Jun 4, 2026
@k8s-ci-robot k8s-ci-robot requested a review from ritazh June 4, 2026 16:04
@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 4, 2026
@enj enj added this to SIG Auth Jun 5, 2026
@enj enj moved this to Needs Triage in SIG Auth Jun 5, 2026
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch from c22d99e to db49bad Compare June 9, 2026 18:17
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 11, 2026
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch from dbd42c7 to fd71104 Compare June 11, 2026 16:07
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pmengelbert
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 11, 2026
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch from fd71104 to efea7f5 Compare June 11, 2026 16:08
@enj enj changed the title [WIP] KEP-6060 PRR Questionnaire KEP-6060: API Server Authentication to Webhooks Jun 11, 2026
@enj enj moved this from Needs Triage to In Review in SIG Auth Jun 11, 2026
@enj enj requested review from deads2k and liggitt and removed request for aramase and ritazh June 11, 2026 16:16
@enj enj added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 11, 2026
is distinct from the service account the the principal requesting the token
might be using to authenticate itself to the Kubernetes API Server. The
Token Acquisition Service Account must have `attest` permissions on the
`APIService` object named in the `TokenRequest`.

Member

Historically, tokens only identified, and authorization checks were done by whatever received the token. This is proposing doing the authorization check at token issuance time so the webhook merely has to look for a particular claim in the token?

If we're doing an authorization check on the service account at the time of issuance, would making the bound object and authorization check be related to the Service / ValidatingWebhookConfiguration / MutatingWebhookConfiguration / CustomResourceDefinition the token is going to be used against make more sense than an APIService? That would let us verify the requested audience was coherent with it as well.

@pmengelbert pmengelbert Jun 15, 2026

Contributor Author

> This is proposing doing the authorization check at token issuance time so the webhook merely has to look for a particular claim in the token?

Yes, in order to avoid the webhook calling back to kube-apiserver for TokenReview or SubjectAccessReview. This is all done to prevent aggregated API servers from probing policy information by asking questions about resources they do not control.

The purpose of using the APIService as the bound object is to make it possible for the Kubernetes API Server to check whether the webhook token acquisition service account has permission to talk to the webhook, and whether it may do so in order to ask questions about this APIService.

With the {Validating,Mutating}WebhookConfiguration as the bound object, I can't think of a way to perform the same authorization check as with the APIService, without the webhook having to call back to kube-apiserver. At token issuance, kube-apiserver does not know what question is being asked; it knows the audience and the bound object.

Because it is proposed that the audience be derived from the webhook config, the audience from the TokenRequest is used to correlate the APIService (from the bound object) and the webhook config corresponding to the audience. From there, it can be determined whether or not the webhook is even relevant to that APIService. Furthermore, kube-apiserver can check whether or not the token acquisition service account has the requisite permissions to ask questions about resources in that APIService. Without the bound APIService, kube-apiserver cannot do that check at credential issuance (it doesn't know the question being asked). We can't assume that the webhook author will have knowledge of which API Services it must support, so the webhook must instead answer the question (of whether or not the principal may ask questions about resources in a particular APIService) by making a SAR request to kube-apiserver. We want to avoid the webhook calling back to the Kubernetes API server for SAR and TokenReview, for obvious reasons.

Contributor Author

After discussing further with @enj, we propose the following:

Support 3 types of bindings:

  • MutatingWebhookConfiguration
  • ValidatingWebhookConfiguration
  • APIService

When either of the webhook configurations is used as the bound object, the permissions required for token issuance are "attest" on "*", because in essence that is what is being permitted by such a token.

Tokens bound to APIService won't require such broad permission, and will be narrowly scoped to that APIService.

How does this strike you?
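
A minimal sketch of the issuance-time rule proposed in this comment, for discussion only. The function name `required_attest_target` and the use of plain strings for kinds are hypothetical; the sketch only shows how the bound object kind would map to the APIService name that the `attest` check runs against.

```python
# Hypothetical sketch: which APIService name the token acquisition service
# account would need "attest" on, given the bound object of a TokenRequest.
WEBHOOK_CONFIG_KINDS = {
    "MutatingWebhookConfiguration",
    "ValidatingWebhookConfiguration",
}


def required_attest_target(bound_kind: str, bound_name: str) -> str:
    """Return the APIService name to run the "attest" check against."""
    if bound_kind in WEBHOOK_CONFIG_KINDS:
        # A token bound to a webhook configuration is not tied to any one
        # APIService, so the broad "*" permission is required.
        return "*"
    if bound_kind == "APIService":
        # Narrowly scoped: only the named APIService.
        return bound_name
    raise ValueError(f"unsupported bound object kind: {bound_kind}")
```

For example, `required_attest_target("APIService", "v1.example.com")` yields `"v1.example.com"`, while either webhook configuration kind yields `"*"`.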

1. Verify the token's signature via the OIDC discovery endpoint.
1. Verify that the token's audience matches the expected audience. This audience
is derived deterministically from the webhook name, and is in the format is
in the format `k8s.io:admission:<webhook-name>`, where `<webhook-name>`

Member

why the unusual prefix? Isn't audience typically the host / URL where the token will be presented?

Is there a reason not to use https://$url/with/path or https://$servicename.$servicenamespace.svc:port/with/path as the audience? Webhooks already are required to have valid serving certificates for those hosts, so no new knowledge should be required to make the webhooks able to validate a similar audience in the token.

Contributor Author

The reason for this is that the webhook author may not know the URL where the webhook is deployed, and there will be more plumbing needed on the webhook side to make that information available to the token verification routine.

That said, we don't want this to be a sticking point, so we'll accept your suggestion if you don't agree with the above reasoning.

Member

Audience as a URL is more typical ... and since webhooks already are required to have valid serving certificates for their host, it seems more likely to me they can know or be told the host/URL more easily
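
To make the URL-style audience concrete, here is a rough sketch of how a client might derive it from a webhook's client configuration. The helper name `audience_for_webhook` and its parameters are hypothetical, and always including the port (defaulting to 443, the default for webhook service references) is an assumption rather than something settled in this thread.

```python
from typing import Optional


def audience_for_webhook(url: Optional[str] = None,
                         service_name: Optional[str] = None,
                         service_namespace: Optional[str] = None,
                         port: int = 443,
                         path: str = "") -> str:
    """Derive a URL-style audience from either a URL or a service reference."""
    if url is not None:
        # URL-based webhooks: the configured URL is used as the audience as-is.
        return url
    if not service_name or not service_namespace:
        raise ValueError("either url or a service name and namespace is required")
    if path and not path.startswith("/"):
        path = "/" + path
    # Service-based webhooks: https://$servicename.$servicenamespace.svc:port/with/path
    return f"https://{service_name}.{service_namespace}.svc:{port}{path}"
```

The webhook would then compare the token's audience against the same host and path it already serves certificates for.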

Comment on lines +225 to +227
1. Verify that the `APIGroup` and `APIVersion` encoded in the token's bound
APIService match the `APIGroup` and `Version` of the resource in the body
of the `AdmissionReview` request.

Member

Does this mean a server that serves lots of different API groups/versions (like kube-apiserver) combined with a webhook that intercepts lots of different groups/versions (as a generic label protection or something) would need a distinct token for every API group/version type it sent to the webhook? That's not ideal.

@pmengelbert pmengelbert Jun 15, 2026

Contributor Author

That is an unfortunate consequence.

In practice, aggregated API Servers will typically have one, and almost always fewer than five, relevant APIService objects. kube-apiserver is a distinct case (could have thousands of APIServices), and I share your concern about the explosion of tokens there.

We could permit kube-apiserver to use a "wildcard" APIService as the bound object. During bootstrapping, the requisite permissions (i.e. "attest" on "*" or an equivalent rule) would be set up for kube-apiserver's token acquisition service account. Then kube-apiserver only has to maintain one token per webhook, while aggregated API servers should be able to manage with one per webhook/apiservice combination.

@pmengelbert pmengelbert Jun 15, 2026

Contributor Author

Note, rather than using the wildcard APIService, we think it's better to allow bindings to either an APIService or one of the two WebhookConfiguration types (see comment linked below):

#6156 (comment)

in the format `k8s.io:admission:<webhook-name>`, where `<webhook-name>`
is the "inner" name of the webhook (i.e. the name in the inner list
of webhooks).
1. Verify that the `APIGroup` and `APIVersion` encoded in the token's bound

Member

A request to $group/v2 of an API can be converted to $group/v1 and sent to a webhook that asked to intercept $group/v1. See https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-matchpolicy

The object sent to the webhook would be in $group/v1, the kind and resource in the AdmissionReview would be $group/v1 and the requestKind and requestResource in the AdmissionReview would be $group/v2.

What APIService / version would you expect to be in the token sent to a webhook for that?

(I commented elsewhere about the implications of tying the token to a particular presented $group/$version ... we may want to tie it to the target webhook or service)

@pmengelbert pmengelbert Jun 15, 2026

Contributor Author

In this case:

  • Aggregated API Servers (likely) don't need to worry about this, since they don't serve custom resources. Any conversion (or request to convert) will have to be done by the aggregated API server itself, so it can just request the token post-conversion.
  • kube-apiserver won't need to worry about this either, if it can request (as proposed in my comment above) tokens bound to one of the WebhookConfiguration types.

Alternatively, we can ignore the APIVersion and decide that we only care about the APIGroup. I don't think it's ever particularly relevant.

If we do decide we care about the version, then this is covered by the webhook: it will check whether or not the (APIGroup, APIVersion) match between those stated in the token and those from the AdmissionReview request body.
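
A rough sketch of the webhook-side check described here, under stated assumptions: the bound APIService name is taken to be available from the token as a plain string, and the comparison is made against the `kind` field of the `AdmissionReview` request (the converted version the webhook receives) rather than `requestKind`. Which of the two to use is exactly the open question above, so treat that choice as illustrative. The function names are hypothetical.

```python
from typing import Tuple


def split_apiservice_name(name: str) -> Tuple[str, str]:
    """Split an APIService name of the form "<version>.<group>" into (group, version)."""
    version, _, group = name.partition(".")
    return group, version


def bound_apiservice_matches(bound_name: str, review_request: dict,
                             check_version: bool = True) -> bool:
    """Compare the token's bound APIService with the AdmissionReview request."""
    # Assumption: compare against request.kind, i.e. the version the object
    # was converted to before being sent to the webhook.
    kind = review_request["kind"]
    group, version = split_apiservice_name(bound_name)
    if group != kind["group"]:
        return False
    # check_version=False models the "only care about the APIGroup" alternative.
    return (not check_version) or version == kind["version"]
```

With `check_version=False`, a token bound to `v2.example.com` would still be accepted for an object presented as `example.com/v1`, which sidesteps the conversion case at the cost of a looser binding.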

for this use. To distinguish between ServiceAccount tokens used for other
purposes, the term **Webhook Authentication Token (WAT)** will be used. However,
it is important to understand that these are ServiceAccount tokens in every
sense; but their use is constrained by newly added private claims.

Member

Maybe it's just me, but I found defining new terms / acronyms for things we already have a bit confusing.

Also, be careful with language like "constrained" unless you really mean these tokens would be invalid to use as service account tokens in other ways because of these claims (which would make them sort of not normal service account tokens)

@pmengelbert pmengelbert Jun 15, 2026

Contributor Author

> Maybe it's just me, but I found defining new terms / acronyms for things we already have a bit confusing.

Not my intention to confuse. It helped with the writing to have those terms laid out, so that I have a consistent way to refer to a specific use of something in a particular context. It also helped with conciseness (e.g., "token acquisition service account" vs. "the service account token named in the TokenRequest for a token to authenticate to webhooks"). The latter is hard to work into more complex sentences.

> Also, be careful with language like "constrained" [. . .]

Noted. I'll give this another pass.

Comment on lines +233 to +239
When a [webhook authentication client](#webhook-authentication-client) needs
to call an admission webhook about a given resource, it issues a `TokenRequest`
for its [webhook token acquisition service account](#webhook-token-acquisition-service-account)
to the Kubernetes API Server. The request includes:

1. A `BoundObjectRef` pointing to the APIService corresponding to the resource
being admitted (e.g., `v1.networking.k8s.io`).

Member

We don't want to have to mint a token for every webhook invocation, a token should be able to be retained and renewed/rotated after some percentage of its lifetime is used. I was expecting each client to only have to maintain one token per webhook, not one token per resource $group/$version per webhook.

Contributor Author

> We don't want to have to mint a token for every webhook invocation, a token should be able to be retained and renewed/rotated after some percentage of its lifetime is used.

Yes, the tokens will be cached until their expiration, at which time they will be refreshed. We're in agreement on that. I'll make sure it gets mentioned here.
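
As an illustration of the retain-and-rotate behavior being discussed, here is a small sketch of a client-side token cache. It follows the "refresh after some percentage of the lifetime" variant; the 80% threshold, the `TokenCache` class, and the `mint` callback signature are all assumptions made for the example, not part of the KEP.

```python
import time
from typing import Callable, Dict, Hashable, Tuple

# Assumption: refresh once 80% of the token's lifetime has elapsed.
REFRESH_FRACTION = 0.8

# (token, issued_at, expires_at), with times in seconds.
Entry = Tuple[str, float, float]


class TokenCache:
    """Caches one token per key, e.g. per (webhook, bound object) pair."""

    def __init__(self, mint: Callable[[Hashable], Entry],
                 now: Callable[[], float] = time.time) -> None:
        self._mint = mint      # issues a fresh token for a key
        self._now = now        # injectable clock, for testing
        self._entries: Dict[Hashable, Entry] = {}

    def get(self, key: Hashable) -> str:
        entry = self._entries.get(key)
        if entry is not None:
            token, issued_at, expires_at = entry
            refresh_at = issued_at + REFRESH_FRACTION * (expires_at - issued_at)
            if self._now() < refresh_at:
                return token   # still fresh: reuse without a new TokenRequest
        # Missing or past the refresh point: mint and store a new token.
        entry = self._mint(key)
        self._entries[key] = entry
        return entry[0]
```

The number of cached tokens then depends only on how keys are chosen: one per webhook if tokens are bound to a webhook configuration, or one per webhook and APIService pair otherwise.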

In the case of `kube-apiserver`, the [webhook token acquisition service
account](#webhook-token-acqcuisition-service-account) will be a discoverable service
account automatically created in the boostrapping process. The name will be
randomized to discourage its abuse by other webhook authentication clients.

Member

We don't randomize built-in service account names for other things ... this seems more confusing than helpful

Rather than randomizing the kube-apiserver serviceaccount, let's just make sure we have workable user stories / examples / documentation / defaults for how:

  1. aggregated servers configure their own distinct service account
  2. permissions are granted to the server's service account (what does the server admin do, what does the webhook configuration author do, what knowledge does each of those actors have, how does that combine to route permissions to the right spots?)

Contributor Author

Agreed. I don't think we can really do anything to prevent people from abusing it, so there's no need to complicate things. I'll update the user stories to include these.


A user creates a Pod. The kube-apiserver needs to consult a validating
admission webhook. It requests a WAT from itself for its dedicated service
account, bound to APIService `v1.networking.k8s.io` with an audience derived from the

Member

if it's presenting a pod (Group: "", Version: "v1"), why is it getting a token bound to APIService v1.networking.k8s.io?

Contributor Author

Mistake, thanks for catching. I meant to use a different resource (to avoid the weirdness of having an empty string APIGroup in the example).

Comment on lines +263 to +273
When an aggregated API server needs to call an admission webhook, it requests
a WAT from the Kubernetes API Server. Each aggregated API server should
have a dedicated service account for this purpose, as it must be named in
the token request. The request flow is:

1. The aggregated API server authenticates to the kube-apiserver using
whatever credential it is configured with. That principal must be authorized
to `create serviceaccount/token` in the relevant namespace.
2. It sends a `TokenRequest` for its dedicated service account, with a
`BoundObjectRef` pointing to the APIService it serves (e.g.,
`v1.example.com`) and the appropriate audience.

Member

similar comment about token reuse and wanting to maintain one reusable token per client per webhook, not per resource $group/$version

Contributor Author

This will be resolved if we are in agreement on this:

#6156 (comment)
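
For reference, the `TokenRequest` from step 2 of the quoted flow might look roughly like the body built below. This is only a sketch: `APIService` as a `boundObjectRef` kind is what the KEP proposes rather than something the API accepts today, and the helper name, the default expiration, and the example audience are hypothetical.

```python
def build_token_request(audience: str, apiservice_name: str,
                        expiration_seconds: int = 3600) -> dict:
    """Build a TokenRequest body bound to an APIService (proposed binding)."""
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenRequest",
        "spec": {
            "audiences": [audience],
            "expirationSeconds": expiration_seconds,
            # Proposed by the KEP: bind the token to the APIService the
            # aggregated API server serves.
            "boundObjectRef": {
                "kind": "APIService",
                "apiVersion": "apiregistration.k8s.io/v1",
                "name": apiservice_name,
            },
        },
    }
```

Under the alternative linked above, the same request would instead name a `MutatingWebhookConfiguration` or `ValidatingWebhookConfiguration` in `boundObjectRef`, which is what would allow one reusable token per client per webhook.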

Comment on lines +301 to +302
ClusterTrustBundle signer attestation). To illustrate the permission model,
the following RBAC configuration is given as an example. To paraphrase Donald

Member

let's make sure it's possible to set up permissions correctly given the knowledge of these two personas:

  • webhook author / webhook config manifest author who knows the things they are intercepting, but not server identities
  • aggregated server admin who knows the identity they are giving the server, but not the webhooks that will need to be called by the server

@pmengelbert pmengelbert Jun 15, 2026

Contributor Author

This is covered by:
#6156 (comment)

Basically, webhook config manifest authors can set up a service account with "attest" on the "*" APIService (or equivalent authz check); that allows TokenRequests for that service account to be bound to a MutatingWebhookConfiguration or a ValidatingWebhookConfiguration.

Aggregated API Server admins will set up a service account with "attest" on the relevant APIService (or equivalent).

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 15, 2026
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch 2 times, most recently from 80f24d9 to 7576c3e Compare June 17, 2026 14:17
@benjaminapetersen benjaminapetersen removed the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 17, 2026
@benjaminapetersen

Member

/hold

@pmengelbert lost me in the commit history 😄

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 17, 2026
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch 2 times, most recently from 2e5c5b9 to 4d4f686 Compare June 17, 2026 16:43
pmengelbert and others added 4 commits June 17, 2026 13:10
- Also did a cursory filling out of kep.yaml

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
These updates are still WIP, to be completed shortly.

Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
Signed-off-by: Ben Petersen <admin@benjaminapetersen.me>
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch from be23593 to 4258856 Compare June 17, 2026 17:14

1. Verify the token's signature via the OIDC discovery endpoint.
1. Verify that the token's audience matches the expected audience. This audience
is derived deterministically from the webhook url, and is in the format

Contributor

Nit --- this is doubled up "is in the format is in the format"

Contributor Author

Thanks. This and other things have since been updated. The version I just pushed is much closer to final.

1. Verify the token's signature via the OIDC discovery endpoint.
1. Verify that the token's audience matches the expected audience. This audience
is derived deterministically from the webhook url, and is in the format
is in the format `https://<url>/with/path`, where `<url>` matches that

Contributor

While this is a good default, I think there are probably setups where the webhook backend expects a different audience? It might need to be configurable in the Mutating/ValidatingAdmissionWebhook object.

In general, it's the relying party that's in control of which audience value(s) are expected.
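
A short sketch of what relying-party control over the audience could look like on the webhook side: the webhook is configured with the set of audiences it accepts and checks the token's `aud` claim against it. The function name and the claims dictionary are hypothetical; the only fact relied on is that a JWT `aud` claim may be a single string or a list of strings.

```python
from typing import Collection


def audience_accepted(claims: dict, expected_audiences: Collection[str]) -> bool:
    """Return True if any audience in the token is one the webhook expects."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        # A single-valued "aud" claim is normalized to a one-element list.
        aud = [aud]
    return any(a in expected_audiences for a in aud)
```

A webhook that keeps the derived default would pass a one-element set, while a backend with its own convention would pass whatever value its operator configured.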

#### `kube-apiserver`:
In the case of `kube-apiserver`, the [token acquisition service
account](#token-acqcuisition-service-account) will be a service with a
well-known name, `kube-system:webhook-auth`, which is automatically created

Contributor

I would prefer if we give kube-apiserver one identity that it can use for all purposes (contacting webhooks, contacting peer replicas, uploading metrics, etc).

If that needs to be a service account, then that's fine, but it should probably be something like system:serviceaccount:kube-system:kube-apiserver.

@pmengelbert pmengelbert marked this pull request as ready for review June 17, 2026 21:31
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 17, 2026
Signed-off-by: Peter Engelbert <pmengelbert@gmail.com>
@pmengelbert pmengelbert force-pushed the pmengelbert/kep-6060-api-server-authentication-to-webhooks/1 branch from bf2bc22 to a82528c Compare June 17, 2026 21:37
@k8s-ci-robot

Contributor

@pmengelbert: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-enhancements-verify | a82528c | link | true | `/test pull-enhancements-verify` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

Status: In Review

6 participants