CI #4991

Triggered via schedule October 29, 2025 09:33
Status Failure
Total duration 3h 42m 15s
Artifacts 46

ci.yaml

on: schedule
metadata (3s)
bump-manifest (21s)
Matrix: amd64 / test-distribution
Matrix: arm64 / test-distribution
amd64 / build-base / build-base (3m 24s)
arm64 / build-base / build-base (3m 27s)
amd64 / test-nccl / build-mpi-operator-compatible-base (1m 54s)
amd64 / test-nccl / nccl-test-gke / build-nccl-gke (2m 22s)
arm64 / test-nccl / build-mpi-operator-compatible-base
arm64 / test-nccl / nccl-test-gke / build-nccl-gke
Matrix: amd64 / test-jax-cutlass-h100 / jax-cutlass-test-h100
Matrix: amd64 / test-jax / run-unit-test
Matrix: amd64 / test-te-a100 / run-unit-test
Matrix: amd64 / test-te-h100 / te-test-h100
amd64 / test-jax / runner / launch-slurm-runner (42m 55s)
amd64 / test-nsys-jax-eks (4m 6s)
amd64 / test-te-a100 / runner / launch-slurm-runner (2h 43m)
amd64 / build-upstream-t5x (8m 38s)
Matrix: amd64 / test-nsys-jax / run-unit-test
amd64 / build-equinox (6m 5s)
amd64 / test-nsys-jax / runner / launch-slurm-runner (31m 56s)
Matrix: amd64 / test-nccl / nccl-test
Matrix: amd64 / test-nccl / nccl-test-gke / nccl-gke
Matrix: arm64 / test-jax-cutlass-h100 / jax-cutlass-test-h100 (waiting for pending jobs)
Matrix: arm64 / test-jax / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-a100 / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-h100 / te-test-h100 (waiting for pending jobs)
arm64 / test-nsys-jax-eks
arm64 / test-jax / runner / launch-slurm-runner
arm64 / test-te-a100 / runner / launch-slurm-runner
arm64 / build-upstream-t5x (9m 54s)
Matrix: arm64 / test-nsys-jax / run-unit-test (waiting for pending jobs)
arm64 / test-nsys-jax / runner / launch-slurm-runner
Matrix: arm64 / test-nccl / nccl-test (waiting for pending jobs)
Matrix: arm64 / test-nccl / nccl-test-gke / nccl-gke (waiting for pending jobs)
amd64 / test-maxtext-gke / maxtext-gke-xpk (1h 0m)
Matrix: amd64 / test-maxtext / maxtext-multinode
Matrix: amd64 / test-maxtext / single-process-multi-device
amd64 / build-rosetta-t5x / build-rosetta (15m 18s)
amd64 / test-axlearn-eks (37m 1s)
amd64 / test-axlearn-fuji-models-eks (5m 33s)
Matrix: amd64 / test-nsys-jax-archive
arm64 / test-maxtext-gke / maxtext-gke-xpk
Matrix: arm64 / test-maxtext / maxtext-multinode (waiting for pending jobs)
Matrix: arm64 / test-maxtext / single-process-multi-device (waiting for pending jobs)
arm64 / build-rosetta-t5x / build-rosetta (18m 57s)
arm64 / test-axlearn-eks (0s)
arm64 / test-axlearn-fuji-models-eks (0s)
Matrix: arm64 / test-nsys-jax-archive
amd64 / test-maxtext / test-maxtext-metrics (20s)
amd64 / collect-docker-tags (4s)
Matrix: amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node
arm64 / test-maxtext / test-maxtext-metrics
arm64 / collect-docker-tags (4s)
Matrix: arm64 / test-rosetta-t5x / vit-multi-gpu-multi-node (waiting for pending jobs)
amd64 / test-maxtext / test-maxtext-sitrep / sitrep (8s)
amd64 / test-rosetta-t5x / test-t5x-rosetta-summary (3s)
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics (17s)
arm64 / test-maxtext / test-maxtext-sitrep / sitrep
arm64 / test-rosetta-t5x / test-t5x-rosetta-summary
arm64 / test-rosetta-t5x / test-t5x-rosetta-metrics
amd64 / test-maxtext / test-maxtext-outcome (2s)
amd64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep (7s)
arm64 / test-maxtext / test-maxtext-outcome
arm64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome (2s)
arm64 / test-rosetta-t5x / test-t5x-rosetta-outcome
make-publish-configs (3s)
merge-new-manifest (12s)
Matrix: publish-containers
finalize / workflow-badge (7s)
finalize / report (11s)
finalize / upload-badge (6s)
finalize / publish-badge (6s)

Annotations

8 errors and 2 warnings
amd64 / test-te-h100 / te-test-h100 (unittest, 8): Process completed with exit code 1.
amd64 / test-jax / jax-A100-unit-test: Process completed with exit code 1.
amd64 / test-nsys-jax / nsys-jax-A100-unit-test: EACCES: permission denied, scandir '/runner/_work/JAX-Toolbox/JAX-Toolbox/pytest-tmp'
amd64 / test-maxtext-gke / maxtext-gke-xpk: Process completed with exit code 1.
amd64 / test-te-a100 / te-A100-unit-test: The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
amd64 / test-maxtext / test-maxtext-outcome: Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics: Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome: Process completed with exit code 1.
merge-new-manifest: Unexpected input(s) 'owner_and_repo', valid inputs are ['route', 'mediaType']
merge-new-manifest: Unexpected input(s) 'owner_and_repo', 'head', 'base', 'body', 'title', 'draft', valid inputs are ['route', 'mediaType']

Artifacts

Produced during runtime
Name Size Digest
artifact-axlearn-build-amd64 567 Bytes sha256:7302e504fb2229b89c3736aa463e55f7333d73b2aca18d53efa87ccf97ac5a55
artifact-axlearn-build-arm64 566 Bytes sha256:99b239fc21f34632bff7b2f003104522fc2732c1527a987ca90a504113053f0b
artifact-axlearn-test 177 KB sha256:5ab8ca10fad63829bb52455e9108d0516a882aa560ddb731510c4ba92dce8878
artifact-base-build-amd64 567 Bytes sha256:468e04512ef86a8298b0b5ba5acbb28546facd9c23ee7a4cb299a88b377fc236
artifact-base-build-arm64 566 Bytes sha256:3052da110048413359115f39e9fceba1407c521e347bc088c09d328baa3335e1
artifact-equinox-build-amd64 569 Bytes sha256:bc09bcba416fd015510900e921218137227cbdeec5d2bf4eab1c910dbaf2082b
artifact-equinox-build-arm64 568 Bytes sha256:935e5a03e522e32294d5ac4077bce30794a64461c31e91865aa55e51083fa359
artifact-final-report 3.62 KB sha256:dea96c96276f5d07ebe86fa94653e421049440e7acc76d56bb8c07d625af9b57
artifact-jax-build-amd64 554 Bytes sha256:449b6bbaf7cf2d7ed71d7651d550b0338b43e03a58933aa61a6caec99f3d2483
artifact-jax-build-arm64 554 Bytes sha256:e886802e1d75c6dfaeff6287a4582b1220a338c786d1633ee28f5a393a6854c4
artifact-maxtext-build-amd64 567 Bytes sha256:19deae390e979d14bcdce894fa82072cf952d7acb5ed58623dbb93658b97290f
artifact-maxtext-build-arm64 569 Bytes sha256:86257bef07bc50f6d20be8cb989f125a45ec880741ac25ecbf2701eb93fb5dc0
artifact-maxtext-test 1.46 KB sha256:eef93926361eae9efe2dedfc947893883eb03b22f540177520029f0922900741
artifact-mpi-operator-compatible-base-build-amd64 638 Bytes sha256:b02d684ff891ab93386e832d670419b6d25e58dfb8a86114e82ee365ac28245e
artifact-nccl-gke-build-amd64 573 Bytes sha256:775f34bb57adbe376f25f28ba2aff9a228b796b801662de2780695032f6fa17e
artifact-rosetta-build-t5x-amd64 583 Bytes sha256:11dfdcf63df7e7c69057b67dc693c7d259ad0c91adbb87e32670a94b0a30e75b
artifact-rosetta-build-t5x-arm64 584 Bytes sha256:fd204f48646b61b96c0bc053610dd96bd62376223e7306b280e13303a76e0145
artifact-rosetta-t5x-mgmn-test 624 Bytes sha256:4b8864fa83fdd67838eaa1742e6afb887b4d857443590a18db4c5f721234fca6
artifact-t5x-build-amd64 569 Bytes sha256:46bd3edb0299ecbc51830e0795c3756be2ebd2f71ed14ca152fb845865353482
artifact-t5x-build-arm64 567 Bytes sha256:c3b05be36ccd5a78fffa34eaa5f98d153faf3294ae09fb1795152be212106de5
artifact-workflow-metadata 277 Bytes sha256:12c881dda01a354cb5f0e0702e590fffa692e45c6d50ce822359739657261857
bumped-manifest 51.5 KB sha256:098dea502daa3fd6ce9d2627d088354bbc58f8d784377de49ec41282232ff852
final-axlearn 258 Bytes sha256:d686648f9bba3e5500605c8bd91be9a9b2a06209237d0ca7a8eb8a1ee31aaccd
final-base 249 Bytes sha256:8ee3819ae0c138a275d46833ea165ba08170cddbf2b68f7cf4eb951aec8f3aba
final-equinox 258 Bytes sha256:677c21bc6f3fa8045708a13c139af997c2aff5452cfc3170e1b6534d4ff4b439
final-jax 246 Bytes sha256:4718ce02ba02ebf1f92ea66d2baffd83e088e8d517bfbf789eb2eafdc15d15c9
final-maxtext 258 Bytes sha256:e17dff0cddff1902073d424f46bfe005de740794e878fc2706f19b1ce3b8e578
final-t5x 246 Bytes sha256:7df3cde3f6dded7e0140c60672e1ec6f8f34eb2b2ddab38e7ec19c5959afc441
final-upstream-t5x 273 Bytes sha256:4a23c25af097c02b7a6f98a58febf4e3522bbd978fbc7fe13dddc588879c1e22
jax-cutlass-test-H100 1.24 KB sha256:4ec88d74352f2b2ae5125aa80dacc6c90894fa7eb32bcffde1a8467d13bee84e
jax-unit-test-A100 30.4 KB sha256:52c648f2c828732792f7e16bc539914c91350b94849a77e508bb02050e74c8f3
mealkit-axlearn 268 Bytes sha256:94b780d5e858ce07e8885a1993050828b7d266e5b2f0e816d96ddad8571a8896
mealkit-equinox 269 Bytes sha256:3b3c9eac9d84d66d21a395d288a11ab3e42d4d687c5f63a3ee75644ec6581513
mealkit-jax 256 Bytes sha256:8b8282c79b7ca859560caa937f1834ecd40d24032438c7d1b0186783eb8c30a5
mealkit-maxtext 268 Bytes sha256:91d887f86630dc8d1255fe51f5c8745f4760c5b290ee8c1bf63e4bea3f156c63
mealkit-t5x 257 Bytes sha256:3b9636c5f8877f84cd903c9a0942273c50fe7db9f2b2ac6a0fb1c3c47f413f24
mealkit-upstream-t5x 281 Bytes sha256:9f8dbdd3ddfe9d51a342e627d3d61a640cecc0ae5f9a41dab265a083b271df75
nccl-gke-all-gather 15.4 KB sha256:d8c9da7dd8eb9543478339545c9b6564bc634059c6bb4804cbc6c0d1feb6b4ab
nccl-gke-all-reduce 15.5 KB sha256:58a014df6f0d2e11419959c99836e13ed80f44e468b93b138b791c3801952123
nccl-gke-broadcast 15.3 KB sha256:3ac58f6becf9dcb16d8d1be8b3cd6433c3428538a7290e080f2ab4b167db8773
nccl-gke-reduce-scatter 15.5 KB sha256:86998db55b91d8a5d67e14e25fd351eb10b0c061d352d5b2022d41665db2ea0a
rosetta-t5x-vit-18903330301-VIT8G1N 16.4 KB sha256:9a6dff116ed12bfca3e764eccd0676512078faeb4e5ee5c6b108b72b9eb09403
te-unit-test-H100 2.09 MB sha256:d80cf607842ab710cb94adfea9cb5a9d81a3c9de4c8a35b47bc5378e69c85161
upstream-maxtext-18903330301-1DP2FSDP4TP1PP_single_process 22.8 KB sha256:d2b90ae28fec260745072b0c1bd4d6fdf6dd2c60718a3cf77df19a886a3bd8b7
upstream-maxtext-18903330301-2DP2FSDP2TP1PP 27.8 KB sha256:a8b3e2234b14e3791f00a41e97bbaaaae318ab8faab965ad76514e8d23905a9e
upstream-maxtext-metrics-test-log 2.52 KB sha256:f8477004d681c53c195905b2489e846d86f14299c7b2e2f6e838f3a49cf07e82