What's the issue?
If you launch a pod with init containers via PipesK8sClient.run and the pod encounters an error in an init container, Pipes will wait indefinitely (or rather, until the default one-day timeout) for the pod to become ready, even though it never will be.
What did you expect to happen?
I expect Dagster to raise a DagsterK8sError and fail the job.
How to reproduce?
1. Create a pod spec with an init_container that exits 1.
2. Launch that pod via PipesK8sClient.run.
3. When the pod reaches status Init:Error, observe that the Dagster run continues without detecting the error.
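A minimal sketch of an asset that reproduces this, assuming a PipesK8sClient resource bound under the key k8s_pipes_client, a hypothetical image name, and that base_pod_spec accepts snake_case pod-spec keys such as init_containers:

```python
from dagster import AssetExecutionContext, Definitions, asset
from dagster_k8s import PipesK8sClient


@asset
def failing_init_asset(context: AssetExecutionContext, k8s_pipes_client: PipesK8sClient):
    # base_pod_spec is merged into the spec of the pod that Pipes launches.
    # The init container below exits 1, which puts the pod into Init:Error;
    # the main container never starts, so Pipes waits until its timeout.
    return k8s_pipes_client.run(
        context=context,
        image="my-pipes-image:latest",  # hypothetical image running dagster-pipes
        base_pod_spec={
            "init_containers": [
                {
                    "name": "failing-init",
                    "image": "busybox",
                    "command": ["sh", "-c", "exit 1"],
                }
            ],
        },
    ).get_materialize_result()


defs = Definitions(
    assets=[failing_init_asset],
    resources={"k8s_pipes_client": PipesK8sClient()},
)
```

With this asset, the pod enters Init:Error almost immediately, but the run keeps polling until the timeout.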
Dagster version
1.10.11
Deployment type
Dagster Helm chart
Deployment details
No response
Additional information
In PR #24313 I added a lot of error-handling code for K8s pods on the Pipes side. It handles recoverable errors by waiting for a retry, and raises DagsterK8sError for unrecoverable errors such as RunContainerError, ErrImagePull, etc.
However, if the init container fails and the pod goes into Init:Error, Dagster does not pick up on that unrecoverable error and does not kill the pod.
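For reference, the failure is visible on the pod itself: with the official Kubernetes Python client, a failed init container shows up under init_container_statuses (separate from container_statuses) as a terminated state with a non-zero exit code, which kubectl renders as the pod status Init:Error. A small sketch, with hypothetical pod and namespace names:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Hypothetical pod and namespace names.
pod = core.read_namespaced_pod(name="my-pipes-pod", namespace="default")

# Failed init containers live in a separate status list from the app containers.
for status in pod.status.init_container_statuses or []:
    terminated = status.state.terminated
    if terminated is not None and terminated.exit_code != 0:
        # kubectl shows this pod as "Init:Error".
        print(
            f"init container {status.name} failed: "
            f"reason={terminated.reason}, exit_code={terminated.exit_code}"
        )
```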
Kubernetes docs do not do a good job of publishing all possible container statuses, which is a bit annoying. I can't find a comprehensive list anywhere. The current (unrecoverable) list we watch for is this:
```python
elif state.waiting.reason in [
    KubernetesWaitingReasons.ErrImagePull,
    KubernetesWaitingReasons.ImagePullBackOff,
    KubernetesWaitingReasons.CrashLoopBackOff,
    KubernetesWaitingReasons.RunContainerError,
]:
```
According to the docs, we may simply need to add Init:Error to this list.
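For illustration, a sketch of what that change might amount to. Note that Init:Error is the pod status kubectl renders when an init container fails, so whether it ever appears verbatim as a container waiting reason needs verification; the fix may instead have to inspect init container statuses, as in the sketch above.

```python
# Internal module path shown for illustration only.
from dagster_k8s.client import KubernetesWaitingReasons

# Proposed extension of the unrecoverable-reason list from the snippet above.
# "Init:Error" is a raw string here because dagster_k8s has no
# KubernetesWaitingReasons constant for it today; a real fix would add one.
UNRECOVERABLE_WAITING_REASONS = [
    KubernetesWaitingReasons.ErrImagePull,
    KubernetesWaitingReasons.ImagePullBackOff,
    KubernetesWaitingReasons.CrashLoopBackOff,
    KubernetesWaitingReasons.RunContainerError,
    "Init:Error",  # proposed addition from this issue
]
```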
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.