
external metric exporter returns an empty resource list on /apis discovery #6694

@FrancoisPoinsot

Report

Versions

KEDA 2.16.1
Kubectl Client Version: v1.32.3
K8s Server Version: v1.31.6-gke.1020000

The problem

Using kubectl, it seems the discovery responses from /api and /apis are never cached.
I am talking about the on-disk cache, not the in-memory cache.
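
For context: client-go has both an in-memory cached discovery client (memory.NewMemCacheClient) and a disk-backed one (disk.NewCachedDiscoveryClientForConfig); kubectl uses the disk-backed one. A minimal sketch of wiring it up, where the cache directories and TTL are illustrative rather than kubectl's exact defaults:

package main

import (
    "fmt"
    "path/filepath"
    "time"

    "k8s.io/client-go/discovery/cached/disk"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/client-go/util/homedir"
)

func main() {
    // Load a kubeconfig the same way kubectl would.
    kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
        panic(err)
    }

    // Disk-backed discovery cache, analogous to kubectl's ~/.kube/cache.
    cacheDir := filepath.Join(homedir.HomeDir(), ".kube", "cache")
    cached, err := disk.NewCachedDiscoveryClientForConfig(
        config,
        filepath.Join(cacheDir, "discovery"), // per-group serverresources.json files
        filepath.Join(cacheDir, "http"),      // HTTP-level response cache
        10*time.Minute,
    )
    if err != nil {
        panic(err)
    }

    // With a warm cache this is answered from disk; only stale or
    // missing entries trigger calls to /api and /apis.
    groups, err := cached.ServerGroups()
    if err != nil {
        panic(err)
    }
    fmt.Println("discovered groups:", len(groups.Groups))
}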

Expected Behavior

Running this twice, you should not see the /api and /apis calls on the second run.

kubectl --context test-francois  -v=10  get crd 2>&1 >/dev/null  | grep "GET https"

Example:

❯ kubectl --context test-francois  -v=10  get crd 2>&1 >/dev/null  | grep "GET https"
I0407 21:45:13.439049   64728 round_trippers.go:560] GET https://104.155.52.149/api?timeout=32s 200 OK in 294 milliseconds
I0407 21:45:13.487568   64728 round_trippers.go:560] GET https://104.155.52.149/apis?timeout=32s 200 OK in 42 milliseconds
I0407 21:45:13.712453   64728 round_trippers.go:560] GET https://104.155.52.149/apis/apiextensions.k8s.io/v1/customresourcedefinitions?limit=500 200 OK in 118 milliseconds

❯ kubectl --context test-francois  -v=10  get crd 2>&1 >/dev/null  | grep "GET https"
I0407 21:45:14.659998   64741 round_trippers.go:560] GET https://104.155.52.149/apis/apiextensions.k8s.io/v1/customresourcedefinitions?limit=500 200 OK in 199 milliseconds

That's what I expect, and it is indeed what I see on a cluster that is not running KEDA.

Actual Behavior

But if the cluster is running KEDA's external metrics provider, you always see both the /api and /apis calls:

kubectl --context another-cluster-with-keda  -v=10  get crd 2>&1 >/dev/null  | grep "GET https"

I0407 22:23:31.662794    3488 round_trippers.go:560] GET https://34.76.140.92/api?timeout=32s 200 OK in 512 milliseconds
I0407 22:23:31.718439    3488 round_trippers.go:560] GET https://34.76.140.92/apis?timeout=32s 200 OK in 52 milliseconds
I0407 22:23:32.005074    3488 round_trippers.go:560] GET https://34.76.140.92/apis/apiextensions.k8s.io/v1/customresourcedefinitions?limit=500 200 OK in 143 milliseconds

❯ kubectl --context another-cluster-with-keda  -v=10  get crd 2>&1 >/dev/null  | grep "GET https"

I0407 22:23:33.261842    3536 round_trippers.go:560] GET https://34.76.140.92/api?timeout=32s 200 OK in 218 milliseconds
I0407 22:23:33.334677    3536 round_trippers.go:560] GET https://34.76.140.92/apis?timeout=32s 200 OK in 69 milliseconds
I0407 22:23:33.502721    3536 round_trippers.go:560] GET https://34.76.140.92/apis/apiextensions.k8s.io/v1/customresourcedefinitions?limit=500 200 OK in 148 milliseconds

Steps to Reproduce the Problem

  1. get a fresh cluster
  2. install KEDA's external metrics adapter
  3. run any kubectl get command twice
  4. witness discovery not being cached on disk

Logs from KEDA operator

not related to the operator

KEDA Version

2.16.1

Kubernetes Version

1.32

Platform

Any

Scaler Details

not related to any scaler

Anything else?

Impact

This is arguably not a big problem. This is not the same thing as this issue.

The in-memory cache still works; only the disk cache is affected.
So as I understand it, this mostly affects the kubectl client, not other operators' clients, which rarely restart.
It just means every kubectl command is a bit slower.

Some explanation

I started looking into it a bit. I don't have a clear answer yet, so I will dump what I have found so far.

  1. Here is the implementation that prevents the disk cache from writing to disk when APIResources is empty: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/cached/disk/cached_discovery.go#L87

And here is the related PR justifying it: kubernetes/kubernetes#42267
So I understand from this that "having a non-empty list of resources" is the intended way to decide whether the response is valid.

You can verify this: if you comment out that block and build/run kubectl with the command above, the response is cached as expected.
That is why I am confident the empty resource list is the trigger behind this cache issue.
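
Condensed paraphrase of that block (see the link above for the real code; details simplified, not verbatim):

// Inside CachedDiscoveryClient.ServerResourcesForGroupVersion, roughly:
liveResources, err := d.delegate.ServerResourcesForGroupVersion(groupVersion)
if err != nil {
    return liveResources, err
}
if liveResources == nil || len(liveResources.APIResources) == 0 {
    // An empty resource list is treated as not worth caching: nothing is
    // written to disk, so the next kubectl invocation re-runs discovery.
    return liveResources, err
}
// ...only non-empty lists reach the code that writes serverresources.json.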

  2. Weirdly enough, the discovery call on that group version itself does return some resources:

 kubectl  get --raw "/apis/external.metrics.k8s.io/v1beta1/"
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[{"name":"externalmetrics","singularName":"","namespaced":true,"kind":"ExternalMetricValueList","verbs":["get"]}]}

Note, in case it was not clear until now: the resources I am talking about are the list of kinds under the group.
But indeed, if I dump what is returned by /apis, I see no resources declared for external.metrics.k8s.io.
Abbreviated dump:

{
    "kind": "APIGroupDiscoveryList",
    "apiVersion": "apidiscovery.k8s.io/v2",
    "metadata": {},
    "items": [
        {
            "metadata": {
                "name": "external.metrics.k8s.io",
                "creationTimestamp": null
            },
            "versions": [
                {
                    "version": "v1beta1",
                    "freshness": "Current"
                }
            ]
        },
        {
            "metadata": {
                "name": "metrics.k8s.io",
                "creationTimestamp": null
            },
            "versions": [
                {
                    "version": "v1beta1",
                    "resources": [
                        {
                            "resource": "nodes",
                            "responseKind": {
                                "group": "",
                                "version": "",
                                "kind": "NodeMetrics"
                            },
                            "scope": "Cluster",
                            "singularResource": "",
                            "verbs": [
                                "get",
                                "list"
                            ]
                        },
                        {
                            "resource": "pods",
                            "responseKind": {
                                "group": "",
                                "version": "",
                                "kind": "PodMetrics"
                            },
                            "scope": "Namespaced",
                            "singularResource": "",
                            "verbs": [
                                "get",
                                "list"
                            ]
                        }
                    ],
                    "freshness": "Current"
                }
            ]
        }
    ]
}
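
For anyone who wants to reproduce that dump: the aggregated document comes back when you request /apis with the aggregated-discovery Accept header. A minimal sketch in Go (it assumes the server supports the apidiscovery.k8s.io/v2 format, which the dump above shows it does):

package main

import (
    "context"
    "fmt"
    "path/filepath"

    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/client-go/util/homedir"
)

func main() {
    kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
        panic(err)
    }
    clientset, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }

    // Ask /apis for the aggregated discovery document instead of the legacy APIGroupList.
    // This Accept value is the one client-go sends for the v2 aggregated format.
    raw, err := clientset.Discovery().RESTClient().
        Get().
        AbsPath("/apis").
        SetHeader("Accept", "application/json;g=apidiscovery.k8s.io;v=v2;as=APIGroupDiscoveryList").
        Do(context.TODO()).
        Raw()
    if err != nil {
        panic(err)
    }
    fmt.Println(string(raw))
}
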
  3. I understand that KEDA's external metrics server is essentially built on custom-metrics-apiserver.
    I looked at it a bit, and the function used to list resources seems fine: https://github.com/kubernetes-sigs/custom-metrics-apiserver/blob/1ff5f1c1962f3b74c66a2c55a9621752a6c0547d/pkg/provider/resource_lister.go#L63
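
For reference, discovery.APIResourceLister is the small interface in k8s.io/apiserver that the per-group-version discovery endpoint is served from. The toy implementation below only mirrors what the raw /apis/external.metrics.k8s.io/v1beta1/ response above contains; it is illustrative, not KEDA's or custom-metrics-apiserver's actual code. Worth noting: the aggregated /apis document is assembled by a separate code path from this legacy per-group-version endpoint, which would be consistent with the two observations above disagreeing.

package main

import (
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apiserver/pkg/endpoints/discovery"
)

// toyLister mimics the resource list the adapter appears to expose,
// based on the raw response shown earlier.
type toyLister struct{}

func (toyLister) ListAPIResources() []metav1.APIResource {
    return []metav1.APIResource{{
        Name:       "externalmetrics",
        Namespaced: true,
        Kind:       "ExternalMetricValueList",
        Verbs:      metav1.Verbs{"get"},
    }}
}

// Compile-time check that toyLister satisfies the interface backing
// the /apis/<group>/<version> discovery handler.
var _ discovery.APIResourceLister = toyLister{}

func main() {
    fmt.Printf("%+v\n", toyLister{}.ListAPIResources())
}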

So there is a fair chance that the issue I am looking for is either:

  • between custom-metrics-apiserver and discovery.APIResourceLister (the latter declares the routes),
  • or a misconfiguration of custom-metrics-apiserver on KEDA's side,
  • or maybe there is no bug and I just misunderstood the expectations.

What I am asking

  1. I would like confirmation that I am correct about the expectation: API discovery should be cached on disk between kubectl calls.

  2. I will keep looking into this, but if someone already has some thoughts about the issue, I would gladly hear them.
