@RUEI4341 RUEI4341 commented Jan 6, 2026

Description

The test team identified several DAGs that share the same cluster and have schedules that are too close to each other. This proximity was causing resource conflicts and potential instability.

This PR staggers the DAG schedules to reduce resource contention, stabilize the cluster, and improve DAG reliability.

Action

Manually updated the cron schedules for the following DAGs:

-- maxtext_moe_tpu_e2e: 03:00 UTC → 02:45 UTC
-- maxtext_trillium_configs_perf: 09:45 UTC → 09:00 UTC
-- maxdiffusion_tpu_e2e: 13:45 UTC → 12:15 UTC
-- maxtext_convergence: 16:30 UTC → 15:15 UTC
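
As a rough sanity check on the new times, the sketch below converts each one to a five-field cron expression and computes the smallest gap between any two of the four DAGs. It assumes each DAG runs once per day at the listed UTC time, which the description does not state outright. The helper names (`to_cron`, `circular_gap`, `min_gap`) are illustrative and not part of the repository, and the check covers only the DAGs listed here, not every DAG on the cluster.

```python
from itertools import combinations

# New start times (UTC) copied from the list above.
# Assumption: each DAG runs once per day at this time.
NEW_SCHEDULES = {
    "maxtext_moe_tpu_e2e": "02:45",
    "maxtext_trillium_configs_perf": "09:00",
    "maxdiffusion_tpu_e2e": "12:15",
    "maxtext_convergence": "15:15",
}

MINUTES_PER_DAY = 24 * 60


def to_cron(hhmm: str) -> str:
    """Convert a daily 'HH:MM' time to a five-field cron expression."""
    hour, minute = (int(part) for part in hhmm.split(":"))
    return f"{minute} {hour} * * *"


def minutes_of_day(hhmm: str) -> int:
    """Return the number of minutes since midnight for 'HH:MM'."""
    hour, minute = (int(part) for part in hhmm.split(":"))
    return hour * 60 + minute


def circular_gap(a: str, b: str) -> int:
    """Gap in minutes between two daily times, wrapping around midnight."""
    diff = abs(minutes_of_day(a) - minutes_of_day(b))
    return min(diff, MINUTES_PER_DAY - diff)


def min_gap(schedules: dict[str, str]) -> int:
    """Smallest gap in minutes between any two start times."""
    return min(
        circular_gap(a, b) for a, b in combinations(schedules.values(), 2)
    )


if __name__ == "__main__":
    for dag_id, start in NEW_SCHEDULES.items():
        print(f"{dag_id}: {to_cron(start)}")
    print(f"smallest gap: {min_gap(NEW_SCHEDULES)} minutes")
```

Under the daily-run assumption, the closest pair among these four is maxdiffusion_tpu_e2e and maxtext_convergence, three hours apart.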

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run one-shot tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed.

@RUEI4341 RUEI4341 changed the title Maxtext/user/dora/adjust dag schedule 14 1 Maxtext/user/dora/adjust_dag_schedule_14_1 Jan 6, 2026
@ooops678 ooops678 force-pushed the maxtext/user/dora/adjust_DAG_schedule_14_1 branch from 0dc53ac to 0407865 January 6, 2026 09:02
@andrewyct andrewyct changed the title Maxtext/user/dora/adjust_dag_schedule_14_1 Adjust DAG schedule 14-1 Jan 6, 2026
@andrewyct andrewyct merged commit 68cd6ee into GoogleCloudPlatform:master Jan 6, 2026
12 checks passed