listing resources often gets 429 "storage is (re)initializing" errors in envtest #58

@orangecms

Just for tracking/understanding, here are the rough details for now; I'll add a reproducer later.

I have a controller with a reconciler operating on CRDs from Envoy Gateway and Kubernetes Gateway.
Rust code is generated via Kopium.

The CRDs come from https://github.com/envoyproxy/gateway/releases/download/v1.6.4/envoy-gateway-crds.yaml and https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.4.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml (plus others from k8s-gateway; this is just one example).

My reconciler fetches a list of some of these resources. Once I load the CRDs into the test environment, the list API calls fail with 429 errors.

Here is a helper that I use for the env setup; it may make sense to upstream it here, as I think it's a very common use case:

    /// Helper for test setup boilerplate.
    pub async fn setup_env_with_crds_and_namespaces(
        crds: &[CustomResourceDefinition],
        namespaces: Vec<String>,
    ) -> (Client, Server) {
        let env = Environment::default().with_crds(crds).unwrap();
        let server = env.create().await.unwrap();
        let client = server.client().unwrap();
        for ns in namespaces {
            create_ns(&ns, &client).await;
        }
        (client, server)
    }

Now I am doing this in my test case:

        let (client, _server) = setup_env_with_crds_and_namespaces(crds, namespaces).await;
        /* ... creating `settings` here, not important for repro, just for completeness ... */
        let ctx = Context { client, settings };
        reconcile(Arc::new(resource), Arc::new(ctx)).await.unwrap();

and reconcile() does this (regardless of whether it lists full objects or metadata only):

    let list_params = ListParams::default();
    let api = kube::Api::<K>::all(client.clone());
    let res = api.list_metadata(&list_params).await?;

I get the following (formatted and unescaped for readability):

ApiError: storage is (re)initializing: TooManyRequests (Status {
  status: Some(Failure),
  code: 429,
  message: "storage is (re)initializing",
  metadata: Some(ListMeta {
    continue_: None,
    remaining_item_count: None,
    resource_version: None,
    self_link: None
  }),
  reason: "TooManyRequests",
  details: Some(StatusDetails {
    name: "",
    group: "",
    kind: "",
    uid: "",
    causes: [],
    retry_after_seconds: 1
  })
})

As an experiment, I added a delay between the setup and running reconcile(), even a whole minute, to no avail.
A colleague of mine hinted at an event that signals readiness after CRD loading on the Go side. Is that possibly something envtest needs to wait on / acknowledge somehow?

Note that I get this only for list API calls. Doing a patch() seems to work just fine, even when loading the same CRDs that I get the errors with when doing the list.
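
To tell a transient 429 from a persistent one, one option is to retry the list call a bounded number of times, waiting for the `retry_after_seconds` hint from the status between attempts. Below is a minimal std-only sketch of that loop. `ApiStatus` and `retry_while_429` are hypothetical names for illustration, not types or functions from kube or envtest, and the closure stands in for the real `api.list_metadata(...)` call.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the status in the error above; NOT a kube type.
#[derive(Debug, Clone, PartialEq)]
struct ApiStatus {
    code: u16,
    message: String,
    retry_after_seconds: Option<u64>,
}

/// Calls `op` until it succeeds, fails with a non-429 status, or
/// `max_attempts` attempts have been made (at least one is always made).
/// Between attempts, `wait` is called with the server's retry hint
/// (assumed default: 1 second when the hint is missing).
/// Returns the last result together with the number of attempts.
fn retry_while_429<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, ApiStatus>,
    mut wait: impl FnMut(Duration),
) -> (Result<T, ApiStatus>, u32) {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match op() {
            // Only 429 is treated as retryable, and only while attempts remain.
            Err(status) if status.code == 429 && attempts < max_attempts => {
                let secs = status.retry_after_seconds.unwrap_or(1);
                wait(Duration::from_secs(secs));
            }
            // Success, a different error, or the final 429: hand it back.
            other => return (other, attempts),
        }
    }
}
```

Passing `std::thread::sleep` as `wait` gives a blocking version; in an async test the same loop would await the list call and a timer such as `tokio::time::sleep` instead. If the call still returns 429 after all attempts, that would match the observation that a fixed one-minute delay did not help.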
