Contributor

@chandrasekaranpradeep chandrasekaranpradeep commented Nov 18, 2025

Summary

In the nightly pipeline, several model tests (both tests expected to pass and tests marked xfail) crash intermittently. When the affected jobs are retriggered manually, the crashed tests often pass on the second attempt. However, when multiple nightly jobs contain crashed cases, each job currently has to be retriggered individually, which is time-consuming and inefficient.

What This PR Introduces

  • A new workflow that automatically:
    • Collects and returns all crashed test cases from a given workflow run using the run ID and artifact logs.
    • Calculates how many runners are required based on the number of crashed cases (5 cases per runner/job), and returns both the crashed test count and the crashed test IDs.
    • Provides a boolean flag indicating whether the workflow run contains any crashed cases, allowing downstream jobs to safely handle the "no crashes" scenario.
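
As an illustration of the runner calculation and the "no crashes" flag described above, here is a minimal Python sketch. It is not the code from this PR: the function name `plan_reruns` and the output keys (`has_crashed_tests`, `crashed_test_count`, `runner_count`, `crashed_test_ids`, `batches`) are hypothetical, and only the figure of 5 cases per runner comes from the description.

```python
import math

# From the PR description: 5 crashed cases are assigned to each runner/job.
CASES_PER_RUNNER = 5


def plan_reruns(crashed_test_ids, cases_per_runner=CASES_PER_RUNNER):
    """Summarize crashed test IDs into the kind of outputs listed above.

    All key names in the returned dict are hypothetical, chosen for this sketch.
    """
    ids = list(crashed_test_ids)
    count = len(ids)
    # Ceiling division: 11 crashed cases at 5 per runner need 3 runners.
    runner_count = math.ceil(count / cases_per_runner)
    # Split the IDs into one batch per runner; the last batch may be shorter.
    batches = [ids[i:i + cases_per_runner] for i in range(0, count, cases_per_runner)]
    return {
        "has_crashed_tests": count > 0,  # lets downstream jobs skip cleanly
        "crashed_test_count": count,
        "runner_count": runner_count,
        "crashed_test_ids": ids,
        "batches": batches,
    }
```

With no crashed cases the sketch reports `has_crashed_tests` as false, a runner count of 0, and an empty batch list, which is the situation the boolean flag is meant to guard against in downstream jobs.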

After collecting the crashed cases, a separate job uses this information to rerun only the crashed test cases, improving CI efficiency and avoiding unnecessary full-job retriggers.
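
One way a rerun job could target only the crashed cases is to pass their test IDs to pytest as positional arguments, which pytest accepts as node IDs. The sketch below assumes the tests are run with pytest and that the collected IDs are pytest node IDs; neither is confirmed by this description, and the helper name `build_rerun_command` and the example IDs are hypothetical.

```python
def build_rerun_command(test_ids, extra_args=()):
    """Build an argv list that reruns only the given tests.

    Assumes each entry in test_ids is a pytest node ID,
    e.g. "tests/test_models.py::test_forward[resnet]" (hypothetical).
    """
    ids = list(test_ids)
    if not ids:
        # Mirrors the "no crashes" case: there is nothing to rerun.
        raise ValueError("no crashed test IDs to rerun")
    # An argv list avoids shell quoting problems with brackets in parametrized IDs.
    return ["pytest", *extra_args, *ids]
```

The returned list can be handed to `subprocess.run` directly, so parametrized IDs that contain brackets do not need to be quoted for a shell.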

  • The crash-collection logic previously implemented only in the model analysis pipeline has now been unified and extracted into a reusable workflow, allowing both the nightly pipeline and the model analysis pipeline to use the same mechanism.

Note

Tested and verified the feature in both the nightly and model analysis pipelines.


codecov-commenter commented Nov 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.55%. Comparing base (e16c140) to head (959047d).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3047   +/-   ##
=======================================
  Coverage   63.55%   63.55%           
=======================================
  Files         156      156           
  Lines       11908    11908           
=======================================
  Hits         7568     7568           
  Misses       4340     4340           


@chandrasekaranpradeep chandrasekaranpradeep force-pushed the pchandrasekaran/enhance_crashed_cases branch 11 times, most recently from 0ff7b46 to a3a10bf November 21, 2025 05:14
@chandrasekaranpradeep chandrasekaranpradeep marked this pull request as ready for review November 21, 2025 05:59