Add Triage agent testsuite #367
Merged
8 commits
a31f3f6 Add documentation about MOCK_JIRA env var to agents README (TomasKorbar)
6fa3dd7 Fix imports of models in triage and backport agent (TomasKorbar)
0c65fd7 Refactor run_workflow function to make it accessible outside of module (TomasKorbar)
8d3ac68 Add triage agent test suite (TomasKorbar)
fb6fb25 Add triage_agent_factory as argument into run_workflow of triage agent (TomasKorbar)
e93e513 Format test_triage with black (TomasKorbar)
15629a7 Add type hints where appropriate (TomasKorbar)
c74d0f1 Fix DRY_RUN application in add_jira_comment tool (TomasKorbar)
@@ -19,6 +19,23 @@ Three agents process tasks through Redis queues:
- **Modify JIRA issues** (add comments, update fields, apply labels)
- **Create GitLab merge requests** and push commits

## Jira mocking

If you clone testing Jira files from
`[email protected]:jotnar-project/testing-jiras.git`
you can use them to work with instead of real Jira server.

Example:

`make run-triage-agent-standalone JIRA_ISSUE=RHEL-15216 MOCK_JIRA=true`

If used together with `DRY_RUN`, the agents won't edit the Jira files,
otherwise they will.

Example:

`make run-triage-agent-standalone JIRA_ISSUE=RHEL-15216 DRY_RUN=true MOCK_JIRA=true`

## Setup

### Required API Tokens & Authentication
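A minimal sketch of how the two flags described in the Jira mocking section could combine. The helper names `flag_enabled` and `jira_mode`, the accepted truthy strings, and the returned dictionary are all assumptions made for illustration; this diff does not show how the agents actually read `MOCK_JIRA` and `DRY_RUN`.

```python
import os
from collections.abc import Mapping


def flag_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical parsing rule: 'true', '1' and 'yes' count as enabled."""
    return env.get(name, "").strip().lower() in {"true", "1", "yes"}


def jira_mode(env: Mapping[str, str] = os.environ) -> dict:
    """Summarize where issues come from and whether edits are allowed."""
    mock = flag_enabled("MOCK_JIRA", env)
    dry_run = flag_enabled("DRY_RUN", env)
    return {
        # MOCK_JIRA switches from the real server to the cloned test files.
        "backend": "mock-files" if mock else "jira-server",
        # DRY_RUN suppresses edits, as the README states for the mock files.
        "writes_allowed": not dry_run,
    }


mock_only = jira_mode({"MOCK_JIRA": "true"})
mock_and_dry = jira_mode({"MOCK_JIRA": "true", "DRY_RUN": "true"})
```

With only `MOCK_JIRA=true` the sketch allows edits to the mock files; adding `DRY_RUN=true` turns edits off, matching the two `make` examples in the README hunk.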
@@ -0,0 +1,44 @@
from datetime import datetime

from beeai_framework.context import (
    RunContextStartEvent,
    RunContextFinishEvent,
    RunMiddlewareProtocol,
    RunContext
)
from beeai_framework.emitter import EmitterOptions, EventMeta
from beeai_framework.emitter.utils import create_internal_event_matcher


class MetricsMiddleware(RunMiddlewareProtocol):
    def __init__(self) -> None:
        self.start_time: datetime | None = None
        self.end_time: datetime | None = None
        self.tool_calls: int = 0

    def bind(self, ctx: RunContext) -> None:
        ctx.emitter.on(
            create_internal_event_matcher("start", ctx.instance),
            self._on_run_context_start,
            EmitterOptions(is_blocking=True, priority=1),
        )
        ctx.emitter.on(
            create_internal_event_matcher("finish", ctx.instance),
            self._on_run_context_finish,
            EmitterOptions(is_blocking=True, priority=1),
        )

    async def _on_run_context_start(self, event: RunContextStartEvent, meta: EventMeta) -> None:
        self.start_time = datetime.now()

    async def _on_run_context_finish(self, event: RunContextFinishEvent, meta: EventMeta) -> None:
        self.end_time = datetime.now()

    @property
    def duration(self) -> float:
        if self.start_time and self.end_time:
            return (self.end_time - self.start_time).total_seconds()
        return 0

    def get_metrics(self) -> dict[str, float]:
        return {"duration": self.duration}
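The `duration` property in the middleware above reports 0 until both timestamps have been recorded. A minimal sketch of that behavior without the framework wiring follows; `DurationTracker` is a hypothetical stand-in written for this illustration and is not part of the PR.

```python
from datetime import datetime, timedelta


class DurationTracker:
    """Hypothetical stand-in mirroring the timing logic of MetricsMiddleware."""

    def __init__(self) -> None:
        self.start_time: datetime | None = None
        self.end_time: datetime | None = None

    @property
    def duration(self) -> float:
        # Only report a duration once both events have been observed.
        if self.start_time and self.end_time:
            return (self.end_time - self.start_time).total_seconds()
        return 0.0


tracker = DurationTracker()
before_run = tracker.duration  # no timestamps recorded yet

# Simulate the start and finish events with fixed timestamps.
tracker.start_time = datetime(2025, 1, 1, 12, 0, 0)
tracker.end_time = tracker.start_time + timedelta(seconds=2.5)
after_run = tracker.duration
```

One consequence of this design is that a run that never emits its finish event is indistinguishable from one that took no time, which may matter when reading the metrics table.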
@@ -0,0 +1,16 @@
from typing import Generator


import pytest


@pytest.hookimpl(wrapper=True)
def pytest_terminal_summary(
    terminalreporter: pytest.TerminalReporter, exitstatus, config: pytest.Config
) -> Generator:
    yield
    metrics = config.stash.get("metrics", None)

    if metrics:
        terminalreporter.write_sep("=", "Metrics")
        terminalreporter.write_line(metrics, flush=True)
@@ -0,0 +1,131 @@
from tabulate import tabulate
import pytest
import os

from agents.triage_agent import run_workflow, TriageState, create_triage_agent
from agents.metrics_middleware import MetricsMiddleware
from agents.observability import setup_observability
from common.models import TriageOutputSchema, Resolution, BackportData


class TriageAgentTestCase:
    def __init__(self, input: str, expected_output: TriageOutputSchema):
        self.input: str = input
        self.expected_output: TriageOutputSchema = expected_output
        self.metrics: dict = None

    async def run(self) -> TriageState:
        metrics_middleware = MetricsMiddleware()

        def testing_factory(gateway_tools):
            triage_agent = create_triage_agent(gateway_tools)
            triage_agent.middlewares.append(metrics_middleware)
            return triage_agent

        finished_state = await run_workflow(self.input, False, testing_factory)
        self.metrics = metrics_middleware.get_metrics()
        return finished_state


test_cases = [
    TriageAgentTestCase(
        input="RHEL-15216",
        expected_output=TriageOutputSchema(
            resolution=Resolution.BACKPORT,
            data=BackportData(
                package="dnsmasq",
                patch_urls=[
                    "http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=patch;h=dd33e98da09c487a58b6cb6693b8628c0b234a3b"
                ],
                justification="not-implemented",
                jira_issue="RHEL-15216",
                cve_id=None,
                fix_version="rhel-8.10",
            ),
        ),
    ),
    TriageAgentTestCase(
        input="RHEL-112546",
        expected_output=TriageOutputSchema(
            resolution=Resolution.BACKPORT,
            data=BackportData(
                package="libtiff",
                patch_urls=[
                    "https://gitlab.com/libtiff/libtiff/-/commit/d1c0719e004fbb223c571d286c73911569d4dbb6.patch"
                ],
                justification="not-implemented",
                jira_issue="RHEL-112546",
                cve_id="CVE-2025-9900",
                fix_version="rhel-9.6.z",
            ),
        ),
    ),
    TriageAgentTestCase(
        input="RHEL-61943",
        expected_output=TriageOutputSchema(
            resolution=Resolution.BACKPORT,
            data=BackportData(
                package="dnsmasq",
                patch_urls=[
                    "http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=patch;h=eb1fe15ca80b6bc43cd6bfdf309ec6c590aff811"
                ],
                justification="not-implemented",
                jira_issue="RHEL-61943",
                cve_id=None,
                fix_version="rhel-8.10.z",
            ),
        ),
    ),
    TriageAgentTestCase(
        input="RHEL-29712",
        expected_output=TriageOutputSchema(
            resolution=Resolution.BACKPORT,
            data=BackportData(
                package="bind",
                patch_urls=[
                    "https://gitlab.isc.org/isc-projects/bind9/-/commit/7e2f50c36958f8c98d54e6d131f088a4837ce269"
                ],
                justification="not-implemented",
                jira_issue="RHEL-29712",
                cve_id=None,
                fix_version="rhel-8.10.z",
            ),
        ),
    ),
]


@pytest.fixture(scope="session", autouse=True)
def observability_fixture():
    return setup_observability(os.environ["COLLECTOR_ENDPOINT"])


@pytest.fixture(scope="session", autouse=True)
def mydata(request):
    yield
    collected_metrics = []
    for test_case in test_cases:
        if test_case.metrics is None:
            continue
        collected_metrics.append([test_case.input] + list(test_case.metrics.values()))
    request.config.stash["metrics"] = tabulate(collected_metrics, ["Issue", "Time"])


@pytest.mark.asyncio
@pytest.mark.parametrize(
    "test_case",
    test_cases,
)
async def test_triage_agent(test_case: TriageAgentTestCase):
    def verify_result(
        real_output: TriageOutputSchema, expected_output: TriageOutputSchema
    ):
        assert real_output.resolution == expected_output.resolution
        assert real_output.data.package == expected_output.data.package
        assert real_output.data.patch_urls == expected_output.data.patch_urls
        assert real_output.data.jira_issue == expected_output.data.jira_issue
        assert real_output.data.cve_id == expected_output.data.cve_id
        assert real_output.data.fix_version == expected_output.data.fix_version

    finished_state = await test_case.run()
    verify_result(finished_state.triage_result, test_case.expected_output)