
CI

CI #5013

Triggered via schedule November 4, 2025 09:35
Status Failure
Total duration 2h 35m 3s
Artifacts 45

ci.yaml

on: schedule
Job | Duration or status
metadata | 3s
bump-manifest | 17s
Matrix: amd64 / test-distribution
Matrix: arm64 / test-distribution
amd64 / build-base / build-base | 4m 7s
arm64 / build-base / build-base | 3m 3s
amd64 / test-nccl / build-mpi-operator-compatible-base | 2m 40s
amd64 / test-nccl / nccl-test-gke / build-nccl-gke | 1m 59s
arm64 / test-nccl / build-mpi-operator-compatible-base
arm64 / test-nccl / nccl-test-gke / build-nccl-gke
Matrix: amd64 / test-jax-cutlass-h100 / jax-cutlass-test-h100
Matrix: amd64 / test-jax / run-unit-test
Matrix: amd64 / test-te-a100 / run-unit-test
Matrix: amd64 / test-te-h100 / te-test-h100
amd64 / test-jax / runner / launch-slurm-runner | 40m 26s
amd64 / test-nsys-jax-eks | 4m 1s
amd64 / test-te-a100 / runner / launch-slurm-runner | 29m 29s
amd64 / build-upstream-t5x | 6m 52s
amd64 / build-axlearn | 6m 3s
Matrix: amd64 / test-nsys-jax / run-unit-test
amd64 / test-nsys-jax / runner / launch-slurm-runner | 1h 41m
Matrix: amd64 / test-nccl / nccl-test
Matrix: amd64 / test-nccl / nccl-test-gke / nccl-gke
Matrix: arm64 / test-jax-cutlass-h100 / jax-cutlass-test-h100 | Waiting for pending jobs
Matrix: arm64 / test-jax / run-unit-test | Waiting for pending jobs
Matrix: arm64 / test-te-a100 / run-unit-test | Waiting for pending jobs
Matrix: arm64 / test-te-h100 / te-test-h100 | Waiting for pending jobs
arm64 / test-nsys-jax-eks | 0s
arm64 / test-jax / runner / launch-slurm-runner
arm64 / test-te-a100 / runner / launch-slurm-runner
arm64 / build-upstream-t5x | 9m 41s
Matrix: arm64 / test-nsys-jax / run-unit-test | Waiting for pending jobs
arm64 / test-nsys-jax / runner / launch-slurm-runner
Matrix: arm64 / test-nccl / nccl-test | Waiting for pending jobs
Matrix: arm64 / test-nccl / nccl-test-gke / nccl-gke | Waiting for pending jobs
amd64 / test-maxtext-gke / maxtext-gke-xpk
Matrix: amd64 / test-maxtext / maxtext-multinode | Waiting for pending jobs
Matrix: amd64 / test-maxtext / single-process-multi-device | Waiting for pending jobs
amd64 / build-rosetta-t5x / build-rosetta | 13m 6s
amd64 / test-axlearn-eks | 25m 21s
amd64 / test-axlearn-fuji-models-eks | 5m 28s
Matrix: amd64 / test-nsys-jax-archive
arm64 / test-maxtext-gke / maxtext-gke-xpk
Matrix: arm64 / test-maxtext / maxtext-multinode | Waiting for pending jobs
Matrix: arm64 / test-maxtext / single-process-multi-device | Waiting for pending jobs
arm64 / build-rosetta-t5x / build-rosetta | 9m 40s
arm64 / test-axlearn-eks | 0s
arm64 / test-axlearn-fuji-models-eks | 0s
Matrix: arm64 / test-nsys-jax-archive
amd64 / test-maxtext / test-maxtext-metrics
amd64 / collect-docker-tags | 3s
Matrix: amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node
arm64 / test-maxtext / test-maxtext-metrics
arm64 / collect-docker-tags | 2s
Matrix: arm64 / test-rosetta-t5x / vit-multi-gpu-multi-node | Waiting for pending jobs
amd64 / test-maxtext / test-maxtext-sitrep / sitrep
amd64 / test-rosetta-t5x / test-t5x-rosetta-summary | 4s
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics | 28s
arm64 / test-maxtext / test-maxtext-sitrep / sitrep
arm64 / test-rosetta-t5x / test-t5x-rosetta-summary
arm64 / test-rosetta-t5x / test-t5x-rosetta-metrics
amd64 / test-maxtext / test-maxtext-outcome
amd64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep | 17s
arm64 / test-maxtext / test-maxtext-outcome
arm64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome | 3s
arm64 / test-rosetta-t5x / test-t5x-rosetta-outcome
make-publish-configs | 3s
merge-new-manifest | 8s
Matrix: publish-containers
finalize / workflow-badge | 8s
finalize / report | 10s
finalize / upload-badge | 14s
finalize / publish-badge | 6s

Annotations

7 errors and 2 warnings
amd64 / build-maxtext
buildx failed with: ERROR: failed to build: failed to solve: process "/bin/sh -c <<\"EOF\" bash -ex -o pipefail\nfor pattern in \\\n \"s|^tensorflow$|tensorflow==2.18.1|g\" \\\n \"s|^tensorflow-text$|tensorflow-text==2.18.1|g\" \\\n \"s|^jax!=.*|jax|g\" \\\n \"s|^jaxlib!=.*|jaxlib|g\" \\\n ; do\n # tensorflow-cpu,tensorboard,tensorflow-text>=2.19.0 is incompatible with tensorflow==2.18.1\n sed -i \"${pattern}\" ${SRC_PATH_MAXTEXT}/base_requirements/requirements.txt\ndone\nEOF" did not complete successfully: exit code: 2
arm64 / build-maxtext
buildx failed with: ERROR: failed to build: failed to solve: process "/bin/sh -c <<\"EOF\" bash -ex -o pipefail\nfor pattern in \\\n \"s|^tensorflow$|tensorflow==2.18.1|g\" \\\n \"s|^tensorflow-text$|tensorflow-text==2.18.1|g\" \\\n \"s|^jax!=.*|jax|g\" \\\n \"s|^jaxlib!=.*|jaxlib|g\" \\\n ; do\n # tensorflow-cpu,tensorboard,tensorflow-text>=2.19.0 is incompatible with tensorflow==2.18.1\n sed -i \"${pattern}\" ${SRC_PATH_MAXTEXT}/base_requirements/requirements.txt\ndone\nEOF" did not complete successfully: exit code: 2
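The two build-maxtext annotations above report exit code 2 from a step that runs `sed -i` over `${SRC_PATH_MAXTEXT}/base_requirements/requirements.txt`. The annotation does not say which command produced that status. GNU sed uses exit status 2 when an input file cannot be opened, so a missing or relocated requirements file is one possible cause; that is an assumption, not something this log confirms. A minimal sketch of the same substitutions with an explicit existence check follows. The function name `pin_requirements` is hypothetical and not part of the original build, and `sed -i` without a suffix assumes GNU sed:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the original Dockerfile): applies the same sed
# substitutions as the failing step, but reports a missing file explicitly
# instead of surfacing only a bare exit code.
pin_requirements() {
  local req_file="$1"
  if [ ! -f "${req_file}" ]; then
    echo "requirements file not found: ${req_file}" >&2
    return 2
  fi
  local pattern
  for pattern in \
    's|^tensorflow$|tensorflow==2.18.1|g' \
    's|^tensorflow-text$|tensorflow-text==2.18.1|g' \
    's|^jax!=.*|jax|g' \
    's|^jaxlib!=.*|jaxlib|g' \
    ; do
    # GNU sed syntax; BSD/macOS sed would need `-i ''`.
    sed -i "${pattern}" "${req_file}"
  done
}
```

Under these assumptions the build step would call `pin_requirements "${SRC_PATH_MAXTEXT}/base_requirements/requirements.txt"`, so a wrong path shows up as a readable message in the build log.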
amd64 / test-te-h100 / te-test-h100 (unittest, 8)
Process completed with exit code 1.
arm64 / build-rosetta-t5x / build-rosetta
buildx failed with: ERROR: failed to build: failed to solve: process "/bin/sh -c pip-finalize.sh" did not complete successfully: exit code: 1
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics
Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome
Process completed with exit code 1.
amd64 / test-te-a100 / te-A100-unit-test
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
merge-new-manifest
Unexpected input(s) 'owner_and_repo', valid inputs are ['route', 'mediaType']
merge-new-manifest
Unexpected input(s) 'owner_and_repo', 'head', 'base', 'body', 'title', 'draft', valid inputs are ['route', 'mediaType']

Artifacts

Produced during runtime
Name | Size | Digest
artifact-axlearn-build-amd64 | 567 Bytes | sha256:026f73c9fa393e16492d55ab22f656554b90d8d63f96defb21d02a9acf4815bb
artifact-axlearn-build-arm64 | 566 Bytes | sha256:9853d6f145572a6e1f7a6d5e5e83efdeb2f92e25777d084b499df8c2237c3fc5
artifact-axlearn-test | 177 KB | sha256:d5db45afa5f0a100a644001c9db264f2afbf760e09d30055a5721aa52c90c181
artifact-base-build-amd64 | 566 Bytes | sha256:a838f3baf355e59f1fbdceb09cf1ac76183abacf0d265a86a1beaeed3ef2caec
artifact-base-build-arm64 | 566 Bytes | sha256:fb7b6275c6b243f9d88569ab1993168d1415ac0fdeabf64ad9610cea7aec7c75
artifact-equinox-build-amd64 | 570 Bytes | sha256:941c325acd72a3b3b2319b40d8910410073c8f091c38507810caa855398c2078
artifact-equinox-build-arm64 | 568 Bytes | sha256:c4900a5c62028b40a2942a01b6c50b40cecd76d04956d4eb5cb9c9987d844632
artifact-final-report | 3.03 KB | sha256:eca767909a3701dfe1613b36cfd49a91694f18a02dbf43d820a63d39a55c74aa
artifact-jax-build-amd64 | 554 Bytes | sha256:22299fc56266eb1dc79a6f6d0ae249c4bad0738b226a3643e73d1ce91aa8aeda
artifact-jax-build-arm64 | 554 Bytes | sha256:b8719e2dfe3ed5f157d1c82c9f1711036166ca733c7aa25980895266aee6c7f8
artifact-maxtext-build-amd64 | 473 Bytes | sha256:950647892bb5d1a1718db197c7281fed6ec3e64c92787dd2dabbf9ae19dc34f4
artifact-maxtext-build-arm64 | 473 Bytes | sha256:ee1380590a6eba009485c41d90d4f2bba28a5f40b294c6706600e495fd9fba48
artifact-mpi-operator-compatible-base-build-amd64 | 639 Bytes | sha256:0076ef287b99af26567f60d8b1b7a244d96490df2905a081c326370e05a06d65
artifact-nccl-gke-build-amd64 | 571 Bytes | sha256:1c51be7d1296188ee1ce31fc259990f250ce2e81f9e681547509cfe1930fa403
artifact-rosetta-build-t5x-amd64 | 585 Bytes | sha256:ef979fce4c012788f797cc5894722b58ad63a8298da5a76b970ca79ccdae2e6f
artifact-rosetta-build-t5x-arm64 | 531 Bytes | sha256:052850f4a0a514817811d27e1058645bca46be51f6ff164ee689cf7f712e2c96
artifact-rosetta-t5x-mgmn-test | 624 Bytes | sha256:292def608f19df8089b86eb4072364e503591fef4e3be7bc60b5e26db4ff3761
artifact-t5x-build-amd64 | 568 Bytes | sha256:2d2d948244280dcdabd9eb83ab40278e5f57344d3f0b5ec9fea3b140b018650e
artifact-t5x-build-arm64 | 567 Bytes | sha256:72b01e62512c3654b003289c02e8dd04c2a1522a612da9c5f141d19265d65a01
artifact-workflow-metadata | 278 Bytes | sha256:aaccb43854e2c907f26ff3adbe28de5df1a5c0760823acaaf25c7ea6a217c025
bumped-manifest | 51.5 KB | sha256:cd503bf6473c3badbb22a7d772077b20c70a174ac1b822057b6f63763d3b2cf7
final-axlearn | 258 Bytes | sha256:133448698818a6412e463271fc84c78faa431adc96cabc2ce9a26b5abf0b7f42
final-base | 249 Bytes | sha256:7a0c7eed37ad5c1c2290da39d5896d99c2e5d52322505b9152de921fbebe45ed
final-equinox | 258 Bytes | sha256:fd991a9ca49c074d236bd922ce87d7f554e2ede16db4d10f0dfdb95e19c1bfc7
final-jax | 246 Bytes | sha256:bb64913a8045c724bf56fc0d4676fdffa09ca08e0d9d52714fa90e881665a30e
final-t5x | 246 Bytes | sha256:8be7973c26802b509795f1029df37c8d95e65983478a01338cc2210bec06e94c
final-upstream-t5x | 273 Bytes | sha256:d736bef255babb4d6329fc7d2721eddc1419076d57514e337ac3cca5f4f2aa1f
jax-cutlass-test-H100 | 1.24 KB | sha256:7cb2551211330fd5e6a490d94959039877ad39d0a0751415fcc0badf315e2fee
jax-unit-test-A100 | 22.1 KB | sha256:6cadee67ee6f20eec35ec25071373bfca7fc33e19088594562d567caae86c8ff
mealkit-axlearn | 269 Bytes | sha256:6824c18ac46f58697e740095ef5641c3e6b0fb0eeb2dab423cb047937539504e
mealkit-equinox | 269 Bytes | sha256:5e9ac3584ba5ad9b24d1b1e24c238741d7498f8afed157063c2f96db37270f38
mealkit-jax | 256 Bytes | sha256:f0c0c94fc8aac20b4a077539a2106fbfc8a7520f0900acb60c0b1ff9d1104620
mealkit-t5x | 257 Bytes | sha256:141c537604b40519af2cea1201462c241bd58c42217d0cb44a4e9df245a2002d
mealkit-upstream-t5x | 283 Bytes | sha256:d54b363caa18134ac484f13a97855de1d3b234b544a8e821789ffb918697fd4b
nccl-gke-all-gather | 15.3 KB | sha256:21d4150c643a217aec6c26d81349926dfdcb8501744a819643d30f82993a8126
nccl-gke-all-gather-sitrep | 231 Bytes | sha256:f73ce92ca5c5eeca7a4fd10ec2b78c348d6d32a6e6fc0e887ea03dc9b015ed94
nccl-gke-all-reduce | 15.5 KB | sha256:6939e4fef58979cebc053bb8d0b5ceb6837216b8e54d85ab15eee6e7198bc85b
nccl-gke-all-reduce-sitrep | 231 Bytes | sha256:10907cd72af4962b279ea30120efceb0e48d2c4ab6aff0ebaf0e76c1001ff398
nccl-gke-broadcast | 15.2 KB | sha256:adb7913e0471d967e80708878fc3abbd155249931468950a80bbca8b50a3ed82
nccl-gke-broadcast-sitrep | 229 Bytes | sha256:3b51890d393d59e0cd98d374893e032b74d1ef492aea352097035bc6fb1eb0fa
nccl-gke-reduce-scatter | 15.5 KB | sha256:3cbfe65d9e28d65b5833468755d20e9917b6f445f7deeaf68b7b2f2571664fa1
nccl-gke-reduce-scatter-sitrep | 234 Bytes | sha256:82b47b47e9f4a7cfa4439ea35031fe03f805e18be692e483562d651b4fec99d7
nsys-jax-unit-test-A100 | 129 MB | sha256:0af2813914aa74988ef9783bc09beefe4180d5071d7aff30f37550cedb7237a0
rosetta-t5x-vit-19064183174-VIT8G1N | 15.3 KB | sha256:cc113dc04375c3431e1fdbb91bcb6163f0338325043d487788e4830c82ea7029
te-unit-test-H100 | 2.08 MB | sha256:929aadea251532d5c74665258b8d789a1d7abfe709f8c23267bfd3a18372f117