Kubernetes backend: Can't run hello world #999

Open
@knexator

❓ Questions and Help

Question

Both of these work correctly:
uv run torchx run --scheduler local_cwd -cfg queue=default --workspace="" utils.echo
uv run torchx run --scheduler local_docker -cfg queue=default --workspace="" utils.echo
However, with the Kubernetes backend I get:

torchx 2025-01-20 12:35:27 INFO     Launched app: kubernetes://torchx/default:echo-rlkrc4nxcxzxs
torchx 2025-01-20 12:35:27 INFO     AppStatus:
    State: UNKNOWN
    Num Restarts: -1
    Roles:
    Msg: <NONE>
    Structured Error Msg: <NONE>
    UI URL: None

Running torchx status kubernetes://torchx/default:echo-rlkrc4nxcxzxs reports the same UNKNOWN state.

I have run pip install torchx[kubernetes] and installed Volcano with kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/v1.6.0/installer/volcano-development.yaml

I am using a local kind cluster, which is otherwise working correctly.
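
For anyone trying to reproduce this, here is a minimal sketch of how the app handle above could be split into its parts and turned into kubectl commands that inspect the job directly, bypassing torchx status. It assumes the handle has the shape scheduler://session/namespace:name, that the Volcano job resource is named jobs.batch.volcano.sh, and that the manifest above installs Volcano into the volcano-system namespace. The helpers parse_handle and kubectl_checks are hypothetical names for illustration, not part of TorchX.

```python
from typing import List, NamedTuple
from urllib.parse import urlparse


class KubernetesHandle(NamedTuple):
    scheduler: str
    session: str
    namespace: str
    name: str


def parse_handle(handle: str) -> KubernetesHandle:
    """Split a handle of the assumed form scheduler://session/namespace:name."""
    parsed = urlparse(handle)
    app_id = parsed.path.lstrip("/")
    namespace, sep, name = app_id.partition(":")
    if not (parsed.scheme and parsed.netloc and sep and namespace and name):
        raise ValueError(f"unexpected handle format: {handle!r}")
    return KubernetesHandle(parsed.scheme, parsed.netloc, namespace, name)


def kubectl_checks(handle: str) -> List[List[str]]:
    """Build (but do not run) kubectl commands for inspecting the job."""
    h = parse_handle(handle)
    return [
        # Assumption: the job was created as a Volcano job resource.
        ["kubectl", "get", "jobs.batch.volcano.sh", h.name, "-n", h.namespace],
        # Pods in the job's namespace, to see whether anything was scheduled.
        ["kubectl", "get", "pods", "-n", h.namespace],
        # Assumption: Volcano's own components run in volcano-system.
        ["kubectl", "get", "pods", "-n", "volcano-system"],
    ]


if __name__ == "__main__":
    for cmd in kubectl_checks("kubernetes://torchx/default:echo-rlkrc4nxcxzxs"):
        print(" ".join(cmd))
```

The commands are only printed, not executed, so they can be checked against the cluster's actual namespaces and resource names before running them.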
