This avoids deadlocks by providing basic gang scheduling. Also, the
cluster now has a few cores' worth of non-GPU node capacity, so we no
longer need to run the post-processing test on the large P5 nodes.
`yq` is now pre-installed on the `eks` runner, as it is ~always needed.
           # Streaming logs will fail if the container/pod is still pending
           while [[ -n $(kubectl get pods --selector=batch.kubernetes.io/job-name=${LAUNCHER_NAME} --output=jsonpath='{.items[?(@.status.phase == "Pending")].metadata.name}') ]]; do
             sleep 1
           done
-      - name: Stream Kubernetes job output
-        # Note that this is *not* JOB_NAME
-        # TODO: --all-containers=true --all-pods=true could make sense here
-        run: kubectl logs --follow job/${LAUNCHER_NAME}
+          # TODO: --all-containers=true --all-pods=true could make sense here, but it
+          # prefixes lines with a rather verbose tag
+          kubectl logs --follow job/${LAUNCHER_NAME}
       - name: Retrieve Kubernetes job status
         shell: bash -exo pipefail {0}
         run: |
@@ -135,7 +131,7 @@ jobs:
         run: |
           # Provide better debug in case of launch failures that will not produce log output
           pods=$(kubectl get pods --selector=batch.kubernetes.io/job-name=${LAUNCHER_NAME} -o name)
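
The wait-while-Pending loop in the first hunk has no upper bound, so a pod that never leaves Pending keeps the step polling. A minimal sketch of a bounded variant follows, assuming a deadline is wanted here. The function name `wait_while_pending`, its arguments, and the default timeout are hypothetical and not part of this commit; the `kubectl get pods` query is the one the workflow already uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the workflow): poll until no pod of the job is
# Pending, but give up after timeout_s seconds instead of looping forever.
wait_while_pending() {
  local job_name=$1
  local timeout_s=${2:-600}   # assumed default, tune for the cluster
  local interval_s=${3:-1}
  local deadline=$(( SECONDS + timeout_s ))
  local pending
  while true; do
    # Same selector and jsonpath filter as the loop in the workflow
    pending=$(kubectl get pods \
      --selector="batch.kubernetes.io/job-name=${job_name}" \
      --output=jsonpath='{.items[?(@.status.phase == "Pending")].metadata.name}')
    if [[ -z ${pending} ]]; then
      return 0
    fi
    if (( SECONDS >= deadline )); then
      echo "Timed out waiting for pending pods: ${pending}" >&2
      return 1
    fi
    sleep "${interval_s}"
  done
}
```

In the step this would be called as `wait_while_pending "${LAUNCHER_NAME}"` just before `kubectl logs --follow`; under `bash -exo pipefail` a non-zero return fails the step, so the later status and debug steps still need to run on failure for the pod listing to be useful.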
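
On the TODO about `--all-containers=true --all-pods=true`: one possible way around the verbose tag is to shorten it in a filter. The sketch below assumes the tag has the form `[pod/<pod-name>/<container-name>] ` at the start of each line, which is my understanding of kubectl's prefix format rather than something this commit states. `shorten_log_prefix` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical filter: rewrite "[pod/<pod>/<container>] text" to "[<pod>] text".
# Lines that do not carry the assumed tag pass through unchanged.
shorten_log_prefix() {
  sed -E 's#^\[pod/([^/]*)/[^]]*\] #[\1] #'
}
```

Usage would look like `kubectl logs --follow --all-containers=true --all-pods=true job/${LAUNCHER_NAME} | shorten_log_prefix`. Because the step runs with `pipefail`, a failing `kubectl logs` still fails the step despite the pipe.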