
Conversation

@martinky82

It provides detailed output about the test statuses in the TR. With the -f/--filter option it prints only the essential information: the test case status and, if not PASS, the errata-resolution data from the notes.

TODO:

  • Write new tests or update the old ones to cover new functionality.
  • Update doc-strings where appropriate.
  • Update or write new documentation in packit/packit.dev.
  • ‹fill in›

Fixes

Related to

Merge before/after

RELEASE NOTES BEGIN

Packit now supports automatic ordering of ☕ after all checks pass.

RELEASE NOTES END

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces a new tool, testrun_analyzer.py, that provides detailed output about test statuses in a TestRun (TR), with options to filter the output and exclude specific patterns. It uses the optparse module for argument parsing, regular expressions for pattern matching, and the nitrate module from the qe package to interact with TestRun data. Users can analyze a test run by ID and filter the output down to the essential information: the test case status and, for non-PASS cases, the errata-resolution data from the notes.
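For concreteness, the skeleton described above might look roughly like this - a sketch only; option names beyond -r/--run and -f/--filter, and the CaseRun attribute names, are assumptions rather than the actual implementation:

#!/usr/bin/env python3
# Sketch of the described CLI; not the actual testrun_analyzer.py.
import optparse
import re

import nitrate

def main():
    parser = optparse.OptionParser(usage="%prog -r RUN_ID [-f] [-x PATTERN]")
    parser.add_option("-r", "--run", type="int", help="TCMS test run ID")
    parser.add_option("-f", "--filter", action="store_true",
                      help="print only status, plus notes for non-PASS cases")
    parser.add_option("-x", "--exclude", action="append", default=[],
                      help="regex to exclude matching lines (repeatable)")
    options, _ = parser.parse_args()
    excludes = [re.compile(p) for p in options.exclude]

    for caserun in nitrate.TestRun(options.run).caseruns:
        status = str(caserun.status)
        line = "%s: %s" % (caserun.testcase, status)
        if any(p.search(line) for p in excludes):
            continue
        print(line)
        # Non-passing cases carry errata-resolution data in the notes field
        if options.filter and status != "PASSED" and caserun.notes:
            print(caserun.notes)

if __name__ == "__main__":
    main()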

@owtaylor
Collaborator

OK, so this is very interesting in showing what we can get out of TCMS. I tried it on one TCMS test run:
analysis-442101-no-filter.txt
analysis-442101-filter.txt

I feel like whether we fed the model the unfiltered or filtered form, we'd need to be very explicit about rules:

Look for lines of the form "Old PASSED & New FAILED => REGRESSED". These
indicate new failures; you should return the test-failed status. On the other
hand, a line "Old FAILED & New FAILED => BROKEN" indicates a broken test. These
can be ignored, but you should return the test-waived status and include
information about waived tests in the comment.

(Note that the regression line above is probably wrong; I just pulled it out of thin air.)
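If we do go with explicit rules like that, a mechanical classifier over those transition lines is straightforward; a sketch, assuming the line format from the (made-up) example above:

import re

# Matches transition lines like "Old PASSED & New FAILED => REGRESSED";
# the exact wording is assumed from the example above.
TRANSITION_RE = re.compile(
    r"Old\s+(?P<old>\w+)\s+&\s+New\s+(?P<new>\w+)\s+=>\s+(?P<verdict>\w+)")

def classify(report: str) -> str:
    """Map a TCMS run report to a coarse status, per the rules above."""
    verdicts = {m.group("verdict") for m in TRANSITION_RE.finditer(report)}
    if "REGRESSED" in verdicts:
        return "test-failed"  # new failures win
    if "BROKEN" in verdicts:
        return "test-waived"  # waive broken tests, mention them in the comment
    return "test-passed"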

What I'd ideally like long term is something a bit different: presenting the results in a form that is structured, human-readable, and also LLM-readable. I think TOML would work well - something like:

[summary]
passed_count=100
regressed_count=2
fixed_count=4
broken_count=10
 
[[regressed]]
name="/Regression/replace-network-manager-patch-in-the-current-version"
arch="x86_64"
avc_check=true
url="https://src.fedoraproject.org/tests/frr.git"
ref="main"
path="Regression/replace-network-manager-patch-in-the-current-version"
old_logs="https://beaker-archive.prod.engineering.redhat.com/beaker-logs/2025/10/117487/11748745/19715109/203402372/taskout.log"
old_result="pass"
new_logs="https://beaker-archive.prod.engineering.redhat.com/beaker-logs/2025/10/213121/blah/blah/blah/taskout.log"
new_result="fail"

And have a common format we can use whether we're getting results from EWA or NEWA or whatever - one that would hopefully be complete enough for a future agent that digs into failures, comes up with patches, etc.
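Once the results are in plain dicts, emitting that format is cheap; a sketch using the third-party tomli-w writer, with field names taken from the example above and purely illustrative data:

import tomli_w  # TOML writer; pip install tomli-w

results = {
    "summary": {"passed_count": 100, "regressed_count": 2,
                "fixed_count": 4, "broken_count": 10},
    "regressed": [{
        "name": "/Regression/replace-network-manager-patch-in-the-current-version",
        "arch": "x86_64",
        "avc_check": True,
        "old_result": "pass",
        "new_result": "fail",
    }],
}

# A list of dicts serializes as an array of tables, i.e. [[regressed]]
print(tomli_w.dumps(results))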

I don't think we can get that level of detail out of TCMS, because the results have already been squished down into a quasi-human-readable form to put into the "notes" field - you'd have to dig out the recipe task IDs and go into Beaker to find the details, and at that point we might as well start from the Beaker results directly.

But rather than go down that route immediately, let's try keeping it simple and see if we can get the model going with something similar to what you have here.

Notes:

  • I don't think we should shell out to a CLI tool; we should just have a tool implemented in Python that the LLM can use directly. However, for early development, you can make the file do something when run from the CLI. There are leftovers from this strategy in errata_utils.py and jira_utils.py, even though the functionality has moved on a lot from what was tested that way.
if __name__ == "__main__":
    print(format_tcms_run(int(sys.argv[1])))
  • My idea for the tool call would be "get_tcms_run_details" which takes a single run ID as input and returns a string as the result - see supervisor/tools/read_issue.py for something that is very similar and can be adapted. (A rough sketch follows this list.)
  • The only thing you are getting from qe.py is the nitrate import - we can just import nitrate directly - use uv add nitrate to add it to pyproject.toml, and we can add it to the Containerfile.supervisor as well (as an RPM if it's in EPEL, otherwise via pip)
  • ANSI coloring is probably not useful for the model - it looks like you can call nitrate.set_color_mode(nitrate.COLOR_OFF)
  • I would just always filter rather than making it a parameter for the tool call. (Or maybe use somewhat reduced filtering - still filter out the Errata Workflow stuff, but don't filter the notes?)
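Putting those notes together, a rough sketch of the tool function - the CaseRun attribute names (testcase, status, notes) are assumptions about the python-nitrate API, and the always-on filtering is the default suggested above:

import nitrate

# ANSI coloring is just noise for the model
nitrate.set_color_mode(nitrate.COLOR_OFF)

def get_tcms_run_details(run_id: int) -> str:
    """Return a filtered plain-text summary of a TCMS test run (sketch)."""
    lines = []
    for caserun in nitrate.TestRun(run_id).caseruns:
        status = str(caserun.status)
        lines.append("%s: %s" % (caserun.testcase, status))
        # Keep the notes for non-passing cases; they hold the
        # errata-resolution data the model needs to see.
        if status != "PASSED" and caserun.notes:
            lines.append(caserun.notes)
    return "\n".join(lines)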

@martinky82
Author

Thanks for the insights.

  1. yes, the prompt needs to be specific about the different statuses. I can extend the script to print out Beaker links as well if that helps. Also, for BROKEN, the LLM could look into the test logs and provide some kind of analysis of what is broken in the test. I tried that several times manually and the analysis was always useful.

  2. TOML: yes, my intention is (in the longer run) to have unified output format so LLM can act upon it no matter what source/pipeline it comes from. So, generally I agree with you on the matter.

  3. I'll refactor the tool so it can be used directly from Python code instead of running it as a shell command

  4. nitrate can be imported directly (python3-nitrate package) so qe.py is not needed after all

  5. setting colouring off is a good idea

  6. yes, it makes sense to have the filtering as a default (it can be the default for the Python access and an option for shell access)

@martinky82 martinky82 changed the title from "Tool to analyze test run passed by -r/--run" to "Implement EWA workflow and tool to parse TCMS Run results" on Nov 6, 2025
Collaborator

@owtaylor owtaylor left a comment


It's great to have something working here!

Various comments about style and structure. Once you've fixed these up, please rebase everything to a single commit on top of upstream.

git rebase -i origin/main
# Edit it so everything after the first commit is 's' for squash
# Edit the commit message
git push --force martinky82 mkyral-ewa

Collaborator

@owtaylor owtaylor left a comment


Generally looks good to land, one small disagreement with the instructions.

@martinky82 martinky82 requested a review from owtaylor December 1, 2025 15:24