Commit 916e0df

sutaakar and claude authored and committed
Fix flaky failure e2e test: increase training duration for reliable polling
The failing-test-runtime training ran only 3 seconds (15 steps × 0.2s), which was too short for the controller's 2s poll interval to capture progress > 0 before the job crashed.

Increased per-step sleep to 0.5s (~8s total) so the controller has 3-4 poll cycles to capture progress.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
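The arithmetic behind the message can be checked with a small sketch. The helper name `poll_cycles` is hypothetical and not part of the commit; the step count, sleep values, and 2s poll interval are taken from the message above. It only counts how many poll intervals fit into the training loop and ignores startup delay and controller latency.

```python
def poll_cycles(steps: int, sleep_per_step: float, poll_interval: float) -> float:
    """Return how many poll intervals fit into the training loop's runtime.

    Hypothetical helper for illustration; it models only the sleep time.
    """
    training_seconds = steps * sleep_per_step
    return training_seconds / poll_interval


# Values from the commit message: 15 steps before the failure, 2s poll interval.
before = poll_cycles(steps=15, sleep_per_step=0.2, poll_interval=2.0)
after = poll_cycles(steps=15, sleep_per_step=0.5, poll_interval=2.0)

print(f"before: {before:.2f} poll intervals, after: {after:.2f} poll intervals")
```

Under these assumptions the old setting leaves room for fewer than two poll intervals, while the new one leaves room for between three and four, which matches the "3-4 poll cycles" in the message.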
1 parent c2c1ebc commit 916e0df

1 file changed

Lines changed: 5 additions & 3 deletions

test/e2e/rhai/resources/failing-test-runtime.yaml

@@ -66,15 +66,17 @@ spec:
 # Wait briefly for server to be ready
 time.sleep(1)

-# Fast training that will fail at 50% (3 seconds total)
+# Training that will fail at 50% (~8 seconds total)
+# Must run long enough for controller to poll progress > 0
+# with a 2s poll interval (at least 3-4 poll cycles).
 print("Starting training that will fail...")
 total_steps = 30
 fail_at_step = 15 # Fail at 50%

 for step in range(fail_at_step):
-    time.sleep(0.2) # 0.2s per step
+    time.sleep(0.5) # 0.5s per step
     progress = int((step / total_steps) * 100)
-    remaining = int((total_steps - step) * 0.2)
+    remaining = int((total_steps - step) * 0.5)

     MetricsHandler.progress_data = {
         "progressPercentage": progress,
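As a rough illustration of why the controller needs several chances to poll, the sketch below reproduces only the progress formula from the loop in this hunk. The function name `progress_values` is hypothetical; `total_steps` and `fail_at_step` use the values shown in the diff. The rest of the loop body is not visible here, so this covers nothing beyond the percentage calculation.

```python
def progress_values(total_steps: int = 30, fail_at_step: int = 15) -> list[int]:
    """List the progress percentages the loop would report before failing.

    Hypothetical helper; it mirrors int((step / total_steps) * 100) from the diff.
    """
    return [int((step / total_steps) * 100) for step in range(fail_at_step)]


values = progress_values()
print(values[0], values[1], values[-1])
```

With these values the first iteration reports 0 and the last one before the failure reports 46, so a poll that lands during the first step still sees no progress. That shortens the window in which progress > 0 is observable, which is consistent with the commit's choice to lengthen each step.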
