
Just another casual day of refactoring MaxText PR#2541 #5017


Triggered via pull request November 4, 2025 17:21
Status Failure
Total duration 3h 33m 39s
Artifacts 53

ci.yaml

on: pull_request
metadata (3s)
bump-manifest (28s)
Matrix: amd64 / test-distribution
Matrix: arm64 / test-distribution
amd64 / build-base / build-base (2m 28s)
arm64 / build-base / build-base (3m 2s)
amd64 / test-nccl / build-mpi-operator-compatible-base (1m 50s)
amd64 / test-nccl / nccl-test-gke / build-nccl-gke (1m 50s)
arm64 / test-nccl / build-mpi-operator-compatible-base
arm64 / test-nccl / nccl-test-gke / build-nccl-gke
Matrix: amd64 / test-jax-cutlass-h100 / jax-cutlass-test-h100
Matrix: amd64 / test-jax / run-unit-test
Matrix: amd64 / test-te-a100 / run-unit-test
Matrix: amd64 / test-te-h100 / te-test-h100
amd64 / test-jax / runner / launch-slurm-runner (38m 0s)
amd64 / test-nsys-jax-eks (3m 56s)
amd64 / test-te-a100 / runner / launch-slurm-runner (2h 38m)
amd64 / build-upstream-t5x (7m 33s)
amd64 / build-axlearn (6m 3s)
Matrix: amd64 / test-nsys-jax / run-unit-test
amd64 / test-nsys-jax / runner / launch-slurm-runner (12m 13s)
Matrix: amd64 / test-nccl / nccl-test
Matrix: amd64 / test-nccl / nccl-test-gke / nccl-gke
Matrix: arm64 / test-jax-cutlass-h100 / jax-cutlass-test-h100 (waiting for pending jobs)
Matrix: arm64 / test-jax / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-a100 / run-unit-test (waiting for pending jobs)
Matrix: arm64 / test-te-h100 / te-test-h100 (waiting for pending jobs)
arm64 / test-nsys-jax-eks (0s)
arm64 / test-jax / runner / launch-slurm-runner
arm64 / test-te-a100 / runner / launch-slurm-runner
arm64 / build-upstream-t5x (9m 17s)
arm64 / build-axlearn (8m 6s)
Matrix: arm64 / test-nsys-jax / run-unit-test (waiting for pending jobs)
arm64 / test-nsys-jax / runner / launch-slurm-runner
Matrix: arm64 / test-nccl / nccl-test (waiting for pending jobs)
Matrix: arm64 / test-nccl / nccl-test-gke / nccl-gke (waiting for pending jobs)
amd64 / test-maxtext-gke / maxtext-gke-xpk (12m 40s)
Matrix: amd64 / test-maxtext / maxtext-multinode
Matrix: amd64 / test-maxtext / single-process-multi-device
amd64 / build-rosetta-t5x / build-rosetta (14m 58s)
amd64 / test-axlearn-eks (29m 58s)
amd64 / test-axlearn-fuji-models-eks (5m 35s)
Matrix: amd64 / test-nsys-jax-archive
arm64 / test-maxtext-gke / maxtext-gke-xpk
Matrix: arm64 / test-maxtext / maxtext-multinode (waiting for pending jobs)
Matrix: arm64 / test-maxtext / single-process-multi-device (waiting for pending jobs)
arm64 / build-rosetta-t5x / build-rosetta (15m 45s)
arm64 / test-axlearn-eks
arm64 / test-axlearn-fuji-models-eks (0s)
Matrix: arm64 / test-nsys-jax-archive
amd64 / test-maxtext / test-maxtext-metrics (26s)
amd64 / collect-docker-tags (4s)
Matrix: amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node
arm64 / test-maxtext / test-maxtext-metrics
arm64 / collect-docker-tags (3s)
Matrix: arm64 / test-rosetta-t5x / vit-multi-gpu-multi-node (waiting for pending jobs)
amd64 / test-maxtext / test-maxtext-sitrep / sitrep (27s)
amd64 / test-rosetta-t5x / test-t5x-rosetta-summary (2s)
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics (28s)
arm64 / test-maxtext / test-maxtext-sitrep / sitrep
arm64 / test-rosetta-t5x / test-t5x-rosetta-summary
arm64 / test-rosetta-t5x / test-t5x-rosetta-metrics
amd64 / test-maxtext / test-maxtext-outcome (4s)
amd64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep (22s)
arm64 / test-maxtext / test-maxtext-outcome
arm64 / test-rosetta-t5x / test-t5x-rosetta-sitrep / sitrep
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome (3s)
arm64 / test-rosetta-t5x / test-t5x-rosetta-outcome
make-publish-configs (4s)
merge-new-manifest (0s)
Matrix: publish-containers
finalize / workflow-badge (6s)
finalize / report (18s)
finalize / upload-badge (11s)
finalize / publish-badge (3s)

Annotations

4 errors
amd64 / test-te-a100 / te-A100-unit-test
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
amd64 / test-maxtext / test-maxtext-outcome
Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-metrics
Process completed with exit code 1.
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome
Process completed with exit code 1.

Artifacts

Produced during runtime
Name | Size | Digest
artifact-axlearn-build-amd64 | 566 Bytes | sha256:b0a06b743e2fbf28916e3b6b7420cd1096e4104c37572b5897b99965d0b9fd68
artifact-axlearn-build-arm64 | 569 Bytes | sha256:fe3e2e1e2110cc02de0e396298976e82f6bcaa42ac76acbb8d6fa0cb7186ae4a
artifact-axlearn-test | 177 KB | sha256:3ce7bc52aa7ba69d1eee04f9d7d1b090d6600b802c23e4f29796c0c893a59014
artifact-base-build-amd64 | 567 Bytes | sha256:a7797d84be1d7d8c6919440e6bf0d6fdd1d82e4af9a7859d2cab2c1ef778c2f0
artifact-base-build-arm64 | 566 Bytes | sha256:27c074af4ab6a88b496338e436e5e0824ce6b5396217560ce75465eb413107e1
artifact-equinox-build-amd64 | 570 Bytes | sha256:c2b810330d16e6f4b767aa1e171e1bdc16285033776713aaf46d65c71d388b8e
artifact-equinox-build-arm64 | 569 Bytes | sha256:5c3eb3125c717c56332da360a233418d8667d6f0b2c39cec49a7eb6841a50d15
artifact-final-report | 3.87 KB | sha256:06647ed433b85ff6cff8336efa0303a8225a2b3617522b4c7cb3c88217044010
artifact-jax-build-amd64 | 552 Bytes | sha256:7620a338d38c6c21791b2828c0993837f816ea74fc778649a4f591cc5f5524f2
artifact-jax-build-arm64 | 554 Bytes | sha256:7b9599ecf4aa596dc61ef9602ebad54c2d300d265967fea9cbbb90300a3079fc
artifact-maxtext-build-amd64 | 569 Bytes | sha256:4796622291e9a7ae5f34bbac951509fff8bfe9a5edfaec506883fedab98fbd01
artifact-maxtext-build-arm64 | 568 Bytes | sha256:f6aa8f78cb905ae59901fa3ef3a6a1b677380da1c62385732b7ac225389a5a77
artifact-maxtext-test | 1.46 KB | sha256:69a865f779e43dfae7bf7b56f3c74b9421ea57f87d15b8ddcaac4138e81da897
artifact-mpi-operator-compatible-base-build-amd64 | 637 Bytes | sha256:9a6ec3e5e6950e5d1434becafb70620e6b9186393671efe9cc4961ee8460ded2
artifact-nccl-gke-build-amd64 | 571 Bytes | sha256:0d7fcddad949b8d6eff16999e131d7f878a544dfd4f6c77618b56ebd74e78534
artifact-rosetta-build-t5x-amd64 | 585 Bytes | sha256:e5fd205c9fec72b845be2f7c32a0cf74759eafdbd9919884f40ceb4867fe5116
artifact-rosetta-build-t5x-arm64 | 584 Bytes | sha256:eed945cf0f5f8b0a537a8ed5fbc40b60fe800190d73ea72b35544b376b717b86
artifact-rosetta-t5x-mgmn-test | 624 Bytes | sha256:f856c89292379cb9a333edf56c593fae8d8f5a86b94cb58528eb52db3aee794b
artifact-t5x-build-amd64 | 568 Bytes | sha256:37c9e19bffa840882f54273ed08b45e330c433a34b4cfde987b352b9997979fa
artifact-t5x-build-arm64 | 568 Bytes | sha256:c02b55992076b4b767869b5410ca9a8e0ae3b277bf0ae24f6cd9ec4a0382ffd2
artifact-workflow-metadata | 278 Bytes | sha256:f6b78270301712891b5b20853909105d0dfdc78430d461e76638e9904fc89d1e
bumped-manifest | 51.5 KB | sha256:7e8f16822337f80b40bcff72c6e414f6778a665b879d788ce519cfc1fad493cd
final-axlearn | 263 Bytes | sha256:4800182280a7cfa639290aa06e1088db63af59310b144b0fa645b267421f6f0c
final-base | 254 Bytes | sha256:aec9e6ed3ced1150ef2b22394dcc0da78bafca2f04f2b3f8d90e4d33f6e9e5be
final-equinox | 263 Bytes | sha256:3de2ed58794aa213d6444c8c1c5f4c9d9396e10e93ad52d9dd6016db62f3df68
final-jax | 251 Bytes | sha256:e17476455207524eabb452cc1186bef617739ef4868ff0bd41effa2f8f086c08
final-maxtext | 262 Bytes | sha256:2172380a27eefadbf823ac116ab774dde22eb8b3412a49bbc8c1cdfed8e0d45a
final-t5x | 251 Bytes | sha256:b9213ad76b9413538c63b5b8b60db1c26f42fb738143dee9b428ffa6798a71a2
final-upstream-t5x | 276 Bytes | sha256:9aad3dcebe53b9bb04355b87390c72b74d4447188b1e6e2a1417e83ee4b01e0c
gke-maxtext-train | 369 MB | sha256:24315371f7bebf5d03e393463b0d3b7b5e2c29c2dcd97f9d41479c87e79dd082
gke-maxtext-train-sitrep | 228 Bytes | sha256:687807f4b6a32c4f1fa3fae8326cb6fde5d65f0bed5eca5086b982bc28bd4d6a
jax-cutlass-test-H100 | 4.24 KB | sha256:67585a8bc5b864cbc0da45dc8ae288a94e5da404890a2f7a20e3b9b641e4cd21
jax-unit-test-A100 | 22.2 KB | sha256:1c52e3a27787625f72fb48ae1db22fddae482edfe8c5e6722725caa5b4840c45
mealkit-axlearn | 271 Bytes | sha256:2cff2a3a84cf6ae26f7d7f111934df938bb45c0ddc85faa0fb80ed76b90fc741
mealkit-equinox | 271 Bytes | sha256:41616dc1fd0453a6e76d053a3b0cd010dbc561c4e01b7a290f8931f792ec6d30
mealkit-jax | 261 Bytes | sha256:e35b9987a0ac3ef1b034bbefd3acbf815c0aa131e50b59cbf142eee0c952e413
mealkit-maxtext | 271 Bytes | sha256:577788bf18d06f9c7b38f4545ecde8893fdb6399dc2a4f7b1963b0e9ae6869a1
mealkit-t5x | 260 Bytes | sha256:284398a5ff8494219bf19cce43409c2d47661d262f35591dcf6f56c6f4b4d90d
mealkit-upstream-t5x | 286 Bytes | sha256:967b4e174a13b7ad0898c9167f5f30561e187edb9e741a917a33195d67b6f213
nccl-gke-all-gather | 15.4 KB | sha256:7fdff3ae33ea1c2326460ccf11e0420f8d7ac80c0c69652c17ad6318f0fcdca1
nccl-gke-all-gather-sitrep | 231 Bytes | sha256:4a7820f07b91af5939d6eabbb5afee52d145ecf3877b0cd7cd0a380e8e993bc2
nccl-gke-all-reduce | 15.5 KB | sha256:6b1b1f057e2d86d0a1b1738f96b2b6ff7ac50985d33a871c2ab13496fd09c6ac
nccl-gke-all-reduce-sitrep | 231 Bytes | sha256:f3af2e86fcf14d02ca861b882ce6c57812c3f3d1f5aa239596e77b84e6ff95fa
nccl-gke-broadcast | 15.3 KB | sha256:0f7d8576e1e05c6a8472f52d3bdc48bd2f9b7c1fe1715b1f7b7f2ec021641e11
nccl-gke-broadcast-sitrep | 229 Bytes | sha256:6f1a31edd7f67403d937ce420365cc2084b71f9def555d6e82640fa769cf3f44
nccl-gke-reduce-scatter | 15.5 KB | sha256:3d388d20db529c53e14b66f79d619f9e41b52b38873c2265788eb3aedb49fd8a
nccl-gke-reduce-scatter-sitrep | 234 Bytes | sha256:6b51bcddd48e3e31108dce1e735f315d1541ad9143a49bebeb5cc084966f7677
nsys-jax-unit-test-A100 | 127 MB | sha256:19f88819762890b6ab9b25157aec61de3e8b0504b95c74bc70bdd721ceed2432
rosetta-t5x-vit-19077169003-VIT8G1N | 15.5 KB | sha256:100593a5c7c2ac1338ccc20b3fc072a92791403ac4e682f69230bbfbbc58a8ba
te-unit-test-H100 | 2.08 MB | sha256:ff7997bff09a2ac3aafd929daa3dbb27c61d9f8e2ee10ac0783dc1c18583bf76
upstream-maxtext-19077169003-1DP2FSDP4TP1PP_single_process | 23.2 KB | sha256:98b4ce4dd09784b829226b74d9791b2f4549e1c3e21adf4e3023e8c1e05e06da
upstream-maxtext-19077169003-2DP2FSDP2TP1PP | 28.3 KB | sha256:e91341bf7655b034565950df1126522d7b1cd9fcb1d28bc3c1d913324e8a2841
upstream-maxtext-metrics-test-log | 2.51 KB | sha256:febbe58df96d4e35fc67cb0ba403bb6cd658d3da0b4738500a21896bf14de6e0
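
Each artifact above is listed with a sha256 digest. A minimal sketch of how a downloaded artifact could be checked against its listed digest follows. It assumes the digest is computed over the downloaded archive file as a whole; the function names `sha256_digest` and `matches_listed_digest` and the example file name are hypothetical and do not come from this workflow.

```python
import hashlib


def sha256_digest(path, chunk_size=1 << 20):
    """Return the file's SHA-256 in the 'sha256:<hex>' form used in the list above."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large artifacts (hundreds of MB) are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def matches_listed_digest(path, listed):
    """Compare a local file against a digest string copied from the artifact list."""
    return sha256_digest(path) == listed.strip().lower()
```

For example, `matches_listed_digest("bumped-manifest.zip", "sha256:7e8f1682...")` would return `True` only when the full listed digest is supplied and the local file (here a hypothetical download name) is byte-identical to what was uploaded.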