[TEST] When running e2e tests locally and Katib is not available, the script should exit without continuing to run tests #2513

Open
@mahdikhashan

Description

What happened?

Trace from the e2e test run:

Katib deployments
No resources found in kubeflow namespace.
Katib services
No resources found in kubeflow namespace.
Katib pods
No resources found in kubeflow namespace.
Katib persistent volume claims
No resources found in kubeflow namespace.
Available CRDs
No resources found
DEBUG:kubernetes.client.rest:response body: {"kind":"Namespace","apiVersion":"v1","metadata":{"name":"default","uid":"5005d5bb-680b-4336-a08c-6ff027d45d27","resourceVersion":"38","creationTimestamp":"2025-02-13T09:50:30Z","labels":{"kubernetes.io/metadata.name":"default"},"managedFields":[{"manager":"kube-apiserver","operation":"Update","apiVersion":"v1","time":"2025-02-13T09:50:30Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:kubernetes.io/metadata.name":{}}}}}]},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}

DEBUG:kubernetes.client.rest:response body: {"kind":"Namespace","apiVersion":"v1","metadata":{"name":"default","uid":"5005d5bb-680b-4336-a08c-6ff027d45d27","resourceVersion":"662","creationTimestamp":"2025-02-13T09:50:30Z","labels":{"katib.kubeflow.org/metrics-collector-injection":"enabled","kubernetes.io/metadata.name":"default"},"managedFields":[{"manager":"kube-apiserver","operation":"Update","apiVersion":"v1","time":"2025-02-13T09:50:30Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{".":{},"f:kubernetes.io/metadata.name":{}}}}},{"manager":"OpenAPI-Generator","operation":"Update","apiVersion":"v1","time":"2025-02-13T09:53:40Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:labels":{"f:katib.kubeflow.org/metrics-collector-injection":{}}}}}]},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}

DEBUG:root:Creating Experiment: default/tune-example
DEBUG:kubernetes.client.rest:response body: 404 page not found

INFO:root:---------------------------------------------------------------
INFO:root:E2E is failed for Experiment created by tune: default/tune-example
INFO:root:---------------------------------------------------------------
INFO:root:---------------------------------------------------------------
DEBUG:kubernetes.client.rest:response body: 404 page not found

Traceback (most recent call last):
  File "/Users/mahdikhashan/kubeflow/katib/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py", line 136, in create_experiment
    outputs = self.custom_api.create_namespaced_custom_object(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api/custom_objects_api.py", line 231, in create_namespaced_custom_object
    return self.create_namespaced_custom_object_with_http_info(group, version, namespace, plural, body, **kwargs)  # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api/custom_objects_api.py", line 354, in create_namespaced_custom_object_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/rest.py", line 279, in POST
    return self.request("POST", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': '00a7b76c-3bac-4a36-956b-955d4fb24dab', 'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '95302837-9f22-4b63-b535-6520429dc684', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'f4fcc431-38ce-4425-a09a-6e0f036a03c5', 'Date': 'Thu, 13 Feb 2025 09:53:40 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found



During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mahdikhashan/kubeflow/katib/test/e2e/v1beta1/scripts/gh-actions/run-e2e-tune-api.py", line 163, in <module>
    raise e
  File "/Users/mahdikhashan/kubeflow/katib/test/e2e/v1beta1/scripts/gh-actions/run-e2e-tune-api.py", line 153, in <module>
    run_e2e_experiment_create_by_tune(katib_client, exp_name, exp_namespace)
  File "/Users/mahdikhashan/kubeflow/katib/test/e2e/v1beta1/scripts/gh-actions/run-e2e-tune-api.py", line 39, in run_e2e_experiment_create_by_tune
    katib_client.tune(
  File "/Users/mahdikhashan/kubeflow/katib/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py", line 706, in tune
    self.create_experiment(experiment, namespace)
  File "/Users/mahdikhashan/kubeflow/katib/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py", line 156, in create_experiment
    raise RuntimeError(
RuntimeError: Failed to create Katib Experiment: default/tune-example

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mahdikhashan/kubeflow/katib/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py", line 1214, in delete_experiment
    self.custom_api.delete_namespaced_custom_object(
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api/custom_objects_api.py", line 916, in delete_namespaced_custom_object
    return self.delete_namespaced_custom_object_with_http_info(group, version, namespace, plural, name, **kwargs)  # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api/custom_objects_api.py", line 1043, in delete_namespaced_custom_object_with_http_info
    return self.api_client.call_api(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
                    ^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 415, in request
    return self.rest_client.DELETE(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/rest.py", line 270, in DELETE
    return self.request("DELETE", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mahdikhashan/kubeflow/katib/venv/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': '22dc585f-d9c7-4ad0-a985-5e53f85b48dc', 'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '95302837-9f22-4b63-b535-6520429dc684', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'f4fcc431-38ce-4425-a09a-6e0f036a03c5', 'Date': 'Thu, 13 Feb 2025 09:53:40 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found



During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mahdikhashan/kubeflow/katib/test/e2e/v1beta1/scripts/gh-actions/run-e2e-tune-api.py", line 168, in <module>
    katib_client.delete_experiment(exp_name, exp_namespace)
  File "/Users/mahdikhashan/kubeflow/katib/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py", line 1227, in delete_experiment
    raise RuntimeError(f"Failed to delete Katib Experiment: {namespace}/{name}")
RuntimeError: Failed to delete Katib Experiment: default/tune-example
No resources found in kubeflow namespace.

What did you expect to happen?

The e2e script should exit early when Katib is not available, instead of continuing to run the tests.

Environment

Kubernetes version:

$ kubectl version

Katib controller version:

$ kubectl get pods -n kubeflow -l katib.kubeflow.org/component=controller -o jsonpath="{.items[*].spec.containers[*].image}"

Katib Python SDK version:

$ pip show kubeflow-katib

Impacted by this bug?

Give it a 👍 We prioritize the issues with most 👍

Activity

Electronic-Waste (Member) commented on Feb 16, 2025

/remove-label lifecycle/needs-triage
/area testing

Rahul-Kumar-prog commented on Mar 2, 2025

/assign

mahdikhashan (Member, Author) commented on Mar 2, 2025

/assign

I would say the first step is to reproduce this error, or otherwise make the e2e tests fail.

You can use a local Kubernetes cluster, whether kind, minikube, or the Kubernetes engine in Docker Desktop.

Then, with a proper Python environment, run this shell script: https://github.com/kubeflow/katib/blob/master/test/e2e/v1beta1/scripts/gh-actions/run-e2e-tune-api.sh.

The goal here is to stop the script when Katib is not accessible.

P.S. You may refer to this file for ideas: test/e2e/v1beta1/scripts/gh-actions/run-e2e-experiment.sh
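
One possible shape for such a check, as a rough sketch rather than the project's actual fix: list the CRDs in the cluster and exit before any Experiment is created if the Katib ones are missing. The helper names (`missing_katib_crds`, `list_cluster_crds`, `require_katib`) are hypothetical, and the list of CRD names is an assumption about what a Katib install registers. The cluster lookup uses `ApiextensionsV1Api.list_custom_resource_definition` from the Kubernetes Python client.

```python
import sys
from typing import Callable, Iterable, List

# CRDs that a Katib installation is assumed to register (assumption for this sketch).
KATIB_CRDS = (
    "experiments.kubeflow.org",
    "suggestions.kubeflow.org",
    "trials.kubeflow.org",
)


def missing_katib_crds(installed: Iterable[str]) -> List[str]:
    """Return the Katib CRDs that are absent from the installed CRD names."""
    present = set(installed)
    return [name for name in KATIB_CRDS if name not in present]


def list_cluster_crds() -> List[str]:
    """List CRD names from the cluster in the current kubeconfig context."""
    # Imported lazily so the pure check above works without the kubernetes package.
    from kubernetes import client, config

    config.load_kube_config()
    crds = client.ApiextensionsV1Api().list_custom_resource_definition()
    return [crd.metadata.name for crd in crds.items]


def require_katib(
    list_crds: Callable[[], Iterable[str]] = list_cluster_crds,
) -> None:
    """Exit with a non-zero status if any Katib CRD is missing."""
    missing = missing_katib_crds(list_crds())
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("Katib is not available, missing CRDs: " + ", ".join(missing))
```

Calling `require_katib()` near the top of the e2e script, before `katib_client.tune(...)`, would turn the chain of 404 tracebacks above into a single message. The `list_crds` parameter exists only so the check can be exercised without a cluster.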

mahdikhashan (Member, Author) commented on Mar 17, 2025

Hey @Rahul-Kumar-prog, any progress on the issue? In any case, feel free to open a draft PR and we can work on it together.

Rahul-Kumar-prog commented on Mar 19, 2025

Yeah, I will open a draft PR.
