Fixed analysis when gather-extra fails in post phase #118
with open(build_file_path, "r", errors="replace", encoding="utf-8") as f:
    build_log_content = list(deque(f, maxlen=BUILD_LOG_TAIL))
# check if there are orion results first, as the actual test may have performance issues
orion_preview, orion_full, cp_tests = scan_orion_jsons(directory_path)
The Orion scan is not relevant here. We only need to verify that the failure phase is post and that the failure is classified under the existing maintenance bucket. In that case, our error message should be:
gather-extra runs at the end of the pipeline. Appears to be general workload failure, but we have performance results to look further.
instead of:
This appears to be an installation or maintenance issue. Please re-trigger the run.
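The message selection proposed in this comment could be sketched roughly as follows. This is only an illustration: `pick_maintenance_message`, `failure_phase`, and `is_maintenance` are hypothetical names, and the actual classification logic in the project is not shown in this thread. The two message strings are taken verbatim from the comment.

```python
from typing import Optional

# Both messages are quoted from the review comment.
MAINTENANCE_MESSAGE = (
    "This appears to be an installation or maintenance issue. "
    "Please re-trigger the run."
)
POST_PHASE_MESSAGE = (
    "gather-extra runs at the end of the pipeline. Appears to be general "
    "workload failure, but we have performance results to look further."
)


def pick_maintenance_message(failure_phase: str, is_maintenance: bool) -> Optional[str]:
    """Return the message for a maintenance-bucket failure, or None otherwise.

    failure_phase and is_maintenance are hypothetical inputs; how the real
    code derives them is an assumption of this sketch.
    """
    if not is_maintenance:
        # Not in the maintenance bucket: this sketch has nothing to say.
        return None
    if failure_phase == "post":
        # Post-phase failures get the more specific message.
        return POST_PHASE_MESSAGE
    return MAINTENANCE_MESSAGE
```

With this shape, only failures that are both in the maintenance bucket and in the post phase receive the new wording; every other maintenance failure keeps the existing message.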
OK, pushed new changes.
)
# No orion results, so fall back to build log analysis (which triggers LLM analysis)
return ProwAnalysisResult(
    errors=[
Why are we removing this?
For better error reporting (that's why I set requires_llm=True)
)
cluster_operators_file_path = os.path.join(directory_path, "clusteroperators.json")
if not os.path.isfile(cluster_operators_file_path):
    with open(build_file_path, "r", errors="replace", encoding="utf-8") as f:
Also, why is this being skipped?
I removed it because we are using Orion.
Description
When gather-extra fails in the post phase, it doesn't create clusteroperators.json. Previously, BugZooka would just return a generic error with no deeper analysis.
Now, we first check whether Orion performance data exists. If it does not, we extract the relevant errors from the build logs and then trigger LLM analysis of those errors.
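The fallback order described here can be sketched as below. This is a simplified illustration, not the code in this PR: `AnalysisSketch`, `analyze_post_failure`, and the `has_orion_results` callback are hypothetical stand-ins, the value of `BUILD_LOG_TAIL` is assumed, and filtering lines on the word "error" is a placeholder for whatever extraction the project actually performs. Setting `requires_llm=False` in the Orion branch is likewise an assumption.

```python
import os
from collections import deque
from dataclasses import dataclass
from typing import Callable, List

BUILD_LOG_TAIL = 200  # assumed value; the real constant may differ


@dataclass
class AnalysisSketch:
    """Hypothetical stand-in for ProwAnalysisResult."""
    errors: List[str]
    requires_llm: bool


def analyze_post_failure(
    directory_path: str,
    build_file_path: str,
    has_orion_results: Callable[[str], bool],
) -> AnalysisSketch:
    # Normal path (not sketched here): clusteroperators.json is present.
    if os.path.isfile(os.path.join(directory_path, "clusteroperators.json")):
        raise NotImplementedError("cluster operator analysis is out of scope")

    # Step 1: prefer Orion performance data when it exists.
    if has_orion_results(directory_path):
        # requires_llm=False here is an assumption of this sketch.
        return AnalysisSketch(errors=["orion results found"], requires_llm=False)

    # Step 2: otherwise keep only the tail of the build log and ask for LLM analysis.
    with open(build_file_path, "r", errors="replace", encoding="utf-8") as f:
        tail = list(deque(f, maxlen=BUILD_LOG_TAIL))
    # Placeholder filter: keep lines that mention "error" (case-insensitive).
    errors = [line.rstrip("\n") for line in tail if "error" in line.lower()]
    return AnalysisSketch(errors=errors, requires_llm=True)
```

Passing the Orion check in as a callback keeps the sketch self-contained; in the real code, the existing `scan_orion_jsons(directory_path)` call would presumably fill that role.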
Error link: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.22-nightly-x86-payload-control-plane-6nodes/2064660673337495552