
Conversation

@martinky82

It provides detailed output about the test statuses in the TR. With the -f/--filter option it prints only the essential information: the test case status and, if not PASS, the errata-resolution data from the notes.

TODO:

  • Write new tests or update the old ones to cover new functionality.
  • Update doc-strings where appropriate.
  • Update or write new documentation in packit/packit.dev.
  • ‹fill in›

Fixes

Related to

Merge before/after

RELEASE NOTES BEGIN

Packit now supports automatic ordering of ☕ after all checks pass.

RELEASE NOTES END

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces a new tool, testrun_analyzer.py, that provides detailed output about test statuses in a TestRun (TR), with options to filter the output and exclude specific patterns. It uses the optparse module for argument parsing, regular expressions for pattern matching, and the nitrate module from the qe package to interact with TestRun data. Users can analyze a test run by ID and filter the output down to the essential information: the test case status and, for non-PASS cases, the errata-resolution data from the notes.
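For concreteness, the skeleton described above might look roughly like this - a sketch only; option names beyond -r/--run and -f/--filter, and the CaseRun attribute names, are assumptions rather than the actual implementation:

#!/usr/bin/env python3
# Sketch of the described CLI; not the actual testrun_analyzer.py.
import optparse
import re

import nitrate

def main():
    parser = optparse.OptionParser(usage="%prog -r RUN_ID [-f] [-x PATTERN]")
    parser.add_option("-r", "--run", type="int", help="TCMS test run ID")
    parser.add_option("-f", "--filter", action="store_true",
                      help="print only status, plus notes for non-PASS cases")
    parser.add_option("-x", "--exclude", action="append", default=[],
                      help="regex to exclude matching lines (repeatable)")
    options, _ = parser.parse_args()
    excludes = [re.compile(p) for p in options.exclude]

    for caserun in nitrate.TestRun(options.run).caseruns:
        status = str(caserun.status)
        line = "%s: %s" % (caserun.testcase, status)
        if any(p.search(line) for p in excludes):
            continue
        print(line)
        # Non-passing cases carry errata-resolution data in the notes field
        if options.filter and status != "PASSED" and caserun.notes:
            print(caserun.notes)

if __name__ == "__main__":
    main()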

@owtaylor
Collaborator

OK, so this is very interesting in showing what we can get out of TCMS. I tried it on one TCMS test run:
analysis-442101-no-filter.txt
analysis-442101-filter.txt

I feel like whether we fed the model the unfiltered or filtered form, we'd need to be very explicit about rules:

Look for lines of the form "Old PASSED & New FAILED => REGRESSED". These
indicate new failures; you should return the test-failed status. On the other
hand, a line "Old FAILED & New FAILED => BROKEN" indicates a broken test. These
can be ignored, but you should return the test-waived status and include
information about waived tests in the comment.

(Note that the regression line above is probably wrong; I just pulled it out of thin air.)
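If we do go with explicit rules like that, a mechanical classifier over those transition lines is straightforward; a sketch, assuming the line format from the (made-up) example above:

import re

# Matches transition lines like "Old PASSED & New FAILED => REGRESSED";
# the exact wording is assumed from the example above.
TRANSITION_RE = re.compile(
    r"Old\s+(?P<old>\w+)\s+&\s+New\s+(?P<new>\w+)\s+=>\s+(?P<verdict>\w+)")

def classify(report: str) -> str:
    """Map a TCMS run report to a coarse status, per the rules above."""
    verdicts = {m.group("verdict") for m in TRANSITION_RE.finditer(report)}
    if "REGRESSED" in verdicts:
        return "test-failed"  # new failures win
    if "BROKEN" in verdicts:
        return "test-waived"  # waive broken tests, mention them in the comment
    return "test-passed"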

What I'd ideally like long term is something a bit different: presenting the results in a form that is structured, human-readable, and also LLM-readable. I think TOML would work well - something like:

[summary]
passed_count=100
regressed_count=2
fixed_count=4
broken_count=10
 
[[regressed]]
name="/Regression/replace-network-manager-patch-in-the-current-version"
arch="x86_64"
avc_check=true
url="https://src.fedoraproject.org/tests/frr.git"
ref="main"
path="Regression/replace-network-manager-patch-in-the-current-version"
old_logs="https://beaker-archive.prod.engineering.redhat.com/beaker-logs/2025/10/117487/11748745/19715109/203402372/taskout.log"
old_result="pass"
new_logs="https://beaker-archive.prod.engineering.redhat.com/beaker-logs/2025/10/213121/blah/blah/blah/taskout.log"
new_result="fail"

And have a common format we can use whether we're getting results from EWA or NEWA or whatever - one that would hopefully be complete enough for a future agent that digs into failures, comes up with patches, etc.
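Once the results are in plain dicts, emitting that format is cheap; a sketch using the third-party tomli-w writer, with field names taken from the example above and purely illustrative data:

import tomli_w  # TOML writer; pip install tomli-w

results = {
    "summary": {"passed_count": 100, "regressed_count": 2,
                "fixed_count": 4, "broken_count": 10},
    "regressed": [{
        "name": "/Regression/replace-network-manager-patch-in-the-current-version",
        "arch": "x86_64",
        "avc_check": True,
        "old_result": "pass",
        "new_result": "fail",
    }],
}

# A list of dicts serializes as an array of tables, i.e. [[regressed]]
print(tomli_w.dumps(results))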

I don't think we can get that level of detail out of TCMS, because the results have already been squished down into a quasi-human-readable form to put into the "notes" field - you'd have to dig out the recipe task IDs and go into Beaker to find the details, and at that point we might as well start from the Beaker results directly.

But rather than go down that route immediately, let's try keeping it simple and see if we can get the model going with something similar to what you have here.

Notes:

  • I don't think we should shell out to a CLI tool; we should just have a tool implemented in Python that the LLM can use directly. However, for early development, you can make the file do something when run from the CLI. There are leftovers from this strategy in errata_utils.py and jira_utils.py, even though the functionality has moved on a lot from what was tested that way.
if __name__ == "__main__":
    print(format_tcms_run(int(sys.argv[1])))
  • My idea for the tool call would be "get_tcms_run_details" which takes a single run ID as input and returns a string as the result - see supervisor/tools/read_issue.py for something that is very similar and can be adapted. (A rough sketch follows this list.)
  • The only thing you are getting from qe.py is the nitrate import - we can just import nitrate directly - use uv add nitrate to add it to pyproject.toml, and we can add it to the Containerfile.supervisor as well (as an RPM if it's in EPEL, otherwise via pip)
  • ANSI coloring is probably not useful for the model - it looks like you can call nitrate.set_color_mode(nitrate.COLOR_OFF)
  • I would just always filter rather than making it a parameter for the tool call. (Or maybe use somewhat reduced filtering - still filter out the Errata Workflow stuff, but don't filter the notes?)
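Putting those notes together, a rough sketch of the tool function - the CaseRun attribute names (testcase, status, notes) are assumptions about the python-nitrate API, and the always-on filtering is the default suggested above:

import nitrate

# ANSI coloring is just noise for the model
nitrate.set_color_mode(nitrate.COLOR_OFF)

def get_tcms_run_details(run_id: int) -> str:
    """Return a filtered plain-text summary of a TCMS test run (sketch)."""
    lines = []
    for caserun in nitrate.TestRun(run_id).caseruns:
        status = str(caserun.status)
        lines.append("%s: %s" % (caserun.testcase, status))
        # Keep the notes for non-passing cases; they hold the
        # errata-resolution data the model needs to see.
        if status != "PASSED" and caserun.notes:
            lines.append(caserun.notes)
    return "\n".join(lines)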

@martinky82
Author

Thanks for the insights.

  1. yes, the prompt needs to be specific about the different statuses. I can extend the script to print out Beaker links as well if that helps. Also, for BROKEN, the LLM could look into the test logs and provide some kind of analysis of what is broken in the test. I tried that several times manually and the analysis was always useful.

  2. TOML: yes, my intention is (in the longer run) to have unified output format so LLM can act upon it no matter what source/pipeline it comes from. So, generally I agree with you on the matter.

  3. I'll refactor the tool so it can be used directly from Python code instead of running it as a shell command

  4. nitrate can be imported directly (python3-nitrate package) so qe.py is not needed after all

  5. setting colouring off is a good idea

  6. yes, it makes sense to have the filtering as a default (it can be the default for the Python access and an option for shell access)

@martinky82 martinky82 changed the title from "Tool to analyze test run passed by -r/--run" to "Implement EWA workflow and tool to parse TCMS Run results" on Nov 6, 2025
Collaborator

@owtaylor owtaylor left a comment


It's great to have something working here!

Various comments about style and structure. Once you've fixed these up, please rebase everything to a single commit on top of upstream.

git rebase -i origin/main
# Edit it so everything after the first commit is 's' for squash
# Edit the commit message
git push --force martinky82 mkyral-ewa

Collaborator

@owtaylor owtaylor left a comment


Generally looks good to land, one small disagreement with the instructions.

@martinky82 martinky82 requested a review from owtaylor December 1, 2025 15:24