feature: improvements to CI failure monitor #16272
base: main
Conversation
Summary of Changes

Hello @dougyster, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the CI failure monitoring system by introducing the capability to analyze failures at the individual test file level. It enhances the existing job-level failure tracking with more detailed insights, allowing developers to quickly pinpoint the root cause of CI instability. The changes include fetching job logs, parsing test summaries, and presenting this granular information in an improved, interactive GitHub summary report, making it easier to identify and address persistent test failures.
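To make the log-parsing step concrete, here is a minimal sketch of what a pytest-summary parser along these lines could look like. The function name, regexes, and assumed log format are illustrative, not the PR's actual code:

```python
import re

def parse_pytest_summary(log_text):
    """Sketch: extract pass/fail counts and failed test files from a
    pytest log. Assumes pytest's standard output, e.g. lines like
    'FAILED test/test_foo.py::test_bar' and a final
    '==== 2 failed, 10 passed in 3.21s ====' summary line."""
    failed_tests = set()
    # 'FAILED <path>::<test>' lines from pytest's short test summary
    for match in re.finditer(r"^FAILED\s+(\S+?\.py)", log_text, re.MULTILINE):
        failed_tests.add(match.group(1))

    # Final counts line; the 'failed' part is absent on all-green runs
    counts = re.search(r"(?:(\d+) failed, )?(\d+) passed", log_text)
    if counts is None:
        return None  # no recognizable summary in this log
    failed = int(counts.group(1) or 0)
    passed = int(counts.group(2))
    return {
        "passed": passed,
        "total": passed + failed,
        "failed_tests": sorted(failed_tests),
    }
```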
Code Review
This pull request significantly enhances the CI failure monitor by introducing test-level failure analysis. It adds functionality to fetch and parse job logs to identify specific failing tests within a job, track their failure streaks, and report them in a new collapsible section in the GitHub Actions summary. The reporting has been refactored from console output to a more detailed and user-friendly markdown format. The changes are well-structured, but there are a few opportunities to improve code quality by reducing duplication and adhering to Python best practices, which I've detailed in my comments.
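As a rough illustration of the reporting side, a collapsible section can be appended to the GitHub Actions job summary via the standard GITHUB_STEP_SUMMARY file. The helper below is a hedged sketch (name and layout are assumptions), not the PR's implementation:

```python
import os

def write_collapsible_section(title, markdown_body):
    """Append a collapsible <details> block to the GitHub Actions job
    summary. GITHUB_STEP_SUMMARY is the standard env var pointing at
    the summary file; the helper name and layout are illustrative."""
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not summary_path:
        return  # not running inside GitHub Actions
    with open(summary_path, "a", encoding="utf-8") as f:
        f.write(f"<details>\n<summary>{title}</summary>\n\n")
        f.write(markdown_body)
        f.write("\n\n</details>\n")
```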
```python
    Returns:
        Dict with passed/total counts and list of failed tests, or None if no summary found.
    """
    import re
```
For better code organization and to adhere to standard Python style (PEP 8), module imports should be at the top of the file. Placing `import re` inside the function can lead to repeated, unnecessary imports if the function is called multiple times. Please move it to the top of the file with the other imports.
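For instance (a sketch; the neighboring imports are assumed from the file):

```python
# Top of ci_failures_analysis.py -- illustrative placement only
import os
import re  # moved out of the log-parsing function to module scope
```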
```python
for match in re.finditer(r"(\S+\.py)", failed_section):
    full_path = match.group(1)
    # Extract just the filename from the path
    test_file = full_path.split("/")[-1] if "/" in full_path else full_path
```
Using `os.path.basename()` is a more idiomatic and robust way to extract a filename from a path in Python compared to manual string splitting. It uses the path separator appropriate to the current platform, making the code more portable. The `os` module is already imported in this file.
```diff
-test_file = full_path.split("/")[-1] if "/" in full_path else full_path
+test_file = os.path.basename(full_path)
```
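For reference, a quick standard-library illustration of the suggested call (the paths here are made up for the example):

```python
import os

print(os.path.basename("test/srt/test_example.py"))  # -> test_example.py
print(os.path.basename("test_example.py"))           # -> test_example.py (bare names pass through)
```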
```python
    Args:
        recent_runs: List of recent run info dicts with job_id, job_url, conclusion, etc.
        debug: Enable debug logging
```
```python
test_failures[test_file]["recent_runs"].append(
    {
        "run_number": run_info.get("run_number"),
        "job_url": run_info.get("job_url"),
        "status": "❌",
        "failed": True,
    }
)
```
The dictionary for a `recent_runs` entry is created in multiple places with the same structure (e.g., lines 253-260, 264-271, etc.). To improve maintainability and reduce code duplication, consider creating a private helper method to construct this dictionary.
For example:

```python
def _create_run_entry(self, run_info, status, failed):
    return {
        "run_number": run_info.get("run_number"),
        "job_url": run_info.get("job_url"),
        "status": status,
        "failed": failed,
    }
```

You could then call it like `self._create_run_entry(run_info, "❌", True)`.
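A side benefit of centralizing the construction is that the entry schema (keys and emoji statuses) lives in one place, so a later change such as adding a timestamp field touches a single method.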
```python
else:
    # Other conclusion (cancelled, skipped, etc.) - don't reset streaks, mark as unknown
    for test_file in test_failures.keys():
        test_failures[test_file]["recent_runs"].append(
            {
                "run_number": run_info.get("run_number"),
                "job_url": run_info.get("job_url"),
                "status": "⚪",
                "failed": None,
            }
        )
```
This else block, which handles conclusions like 'cancelled' or 'skipped', is identical to the else block on lines 261-271 that handles a job failure where the test summary could not be parsed. This code duplication can be avoided by restructuring the conditional logic to group all cases that should result in an "unknown" status.
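For instance, one way to group the cases (a hedged sketch building on the hypothetical `_create_run_entry` helper suggested above; the success branch is omitted, and the surrounding names `conclusion`, `summary`, `test_failures`, and `run_info` are assumptions from the diff):

```python
# Sketch only: both "failure whose summary could not be parsed" and
# "other conclusion (cancelled, skipped, ...)" collapse into one
# neutral branch. Streaks are left untouched either way.
if conclusion == "failure" and summary is not None:
    # Record a concrete failure entry for each failing test file.
    for test_file in summary["failed_tests"]:
        test_failures[test_file]["recent_runs"].append(
            self._create_run_entry(run_info, "❌", True)
        )
else:
    # Unknown outcome: one shared branch replaces the two duplicates.
    for test_file in test_failures:
        test_failures[test_file]["recent_runs"].append(
            self._create_run_entry(run_info, "⚪", None)
        )
```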
Motivation
Improve the quality of the failure monitor by showing test-level failures for each job failure, along with other fixes and design improvements.
Modifications
`ci_failures_analysis.py`
Accuracy Tests
(https://github.com/sgl-project/sglang/actions/runs/20641823449)
Benchmarking and Profiling
N/A
Checklist
Review Process
(/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci) or contact authorized users to do so.