
ci: add agent unit test workflow with auto-discovery and reporting #103

Merged
tarun-etikala merged 1 commit into
red-hat-data-services:main from
tarun-etikala:feat/RHAIENG-4065-ci-agent-tests
May 15, 2026

Conversation

@tarun-etikala
Contributor

Description

Add a CI workflow (agent-tests.yml) that automatically discovers and runs agent unit tests on every PR and push to main. This closes the gap where broken agent code could merge without test validation.

Key features:

  • Auto-discovery — scans agents/*/*/tests/test_*.py to find testable agents. No workflow edits needed when new agents are added.
  • Path filtering on PRs — uses git diff to run only changed agents' tests. Push to main runs all agents (full regression).
  • Consolidated test report: mikepenz/action-junit-report produces a single "Agent Test Results" check with pass/fail counts and inline failure annotations on the PR diff.
  • Single job — discover + test loop + report in one job. Tests are fast (<1s per agent), so matrix overhead isn't justified.
  • All actions pinned to Node.js 24 SHAs (checkout v5.0.1, setup-python v6.2.0, setup-uv v7.6.0).
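
The discovery and path-filtering behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual script: the function names `discover_agents` and `select_agents` are hypothetical, and the sketch assumes the file list and the `git diff` output have already been collected into plain lists of repo-relative paths.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def discover_agents(repo_files):
    """Return agent directories (agents/<framework>/<name>) that have a tests/test_*.py file.

    Mirrors the glob agents/*/*/tests/test_*.py, so files in deeper folders
    such as tests/integration/ do not count.
    """
    agents = set()
    for path in repo_files:
        parts = PurePosixPath(path).parts
        if (
            len(parts) == 5
            and parts[0] == "agents"
            and parts[3] == "tests"
            and fnmatchcase(parts[4], "test_*.py")
        ):
            agents.add("/".join(parts[:3]))
    return sorted(agents)


def select_agents(all_agents, event_name, changed_files):
    """On pull requests keep only agents with changed files; otherwise keep all."""
    if event_name != "pull_request":
        return list(all_agents)  # push to main: full regression
    return [
        agent
        for agent in all_agents
        if any(changed.startswith(agent + "/") for changed in changed_files)
    ]
```

Because selection is driven purely by paths, a new agent that follows the same directory layout would be picked up without touching the workflow file.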

Also included:

  • Standardized all 9 agent Makefiles: make test now excludes tests/integration/ and tests/behavioral/ with $(PYTEST_ARGS) support for CI to inject --junitxml.
  • Fixed 5 broken unit tests across 3 agents:
    • react_agent: stale assertion in test_dummy_web_search_return_format
    • agentic_rag: 3 tests out of sync with refactored get_retriever_components (now reads VECTOR_STORE_ID from env); removed ad-hoc integration test
    • openai_responses_agent: health test mock wasn't effective due to lifespan reference capture
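
To illustrate how CI could use the `$(PYTEST_ARGS)` hook to inject `--junitxml`, here is a small sketch that builds the per-agent `make test` invocation. The helper name `make_test_command`, the `test-results` directory, and the report naming scheme are assumptions for illustration only; the real workflow may lay out its reports differently.

```python
from pathlib import PurePosixPath


def make_test_command(agent_dir, results_dir="test-results"):
    """Build an argv list that runs `make test` for one agent with a JUnit report.

    The report name is derived from the agent path so each agent writes its
    own file (hypothetical naming scheme).
    """
    report_name = agent_dir.strip("/").replace("/", "-") + ".xml"
    report_path = PurePosixPath(results_dir) / report_name
    # `make -C <dir>` runs the target from inside the agent directory, so a
    # relative report path would normally resolve relative to that directory.
    return [
        "make",
        "-C",
        agent_dir,
        "test",
        f"PYTEST_ARGS=--junitxml={report_path} -v",
    ]
```

Passing the list to `subprocess.run(..., check=False)` inside a loop would let one failing agent be recorded without stopping the remaining agents.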

Jira Ticket

RHAIENG-4065

Testing

  • make test passes (run from the affected agent directory)
  • Manual testing performed (describe steps below)
  • No testing required (documentation/config change only)

Ran make test for all 6 agents with unit tests locally — 5/6 pass (crewai fails only due to macOS ARM onnxruntime wheel, passes on Linux). Verified make test PYTEST_ARGS="--junitxml=results.xml -v" produces valid JUnit XML. Validated workflow with actionlint. Tested discover script locally with simulated multi-agent PR diffs.
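
One way to sanity-check the JUnit XML mentioned above is to parse it and total the counters. The sketch below is an assumption about how such a check might look, not part of the PR; `summarize_junit` is a hypothetical helper, and it relies only on the `tests`, `failures`, `errors`, and `skipped` attributes that pytest writes on each `<testsuite>` element.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Total the test counters across all <testsuite> elements in a JUnit report."""
    root = ET.fromstring(xml_text)
    # Reports may have a <testsuites> wrapper or a bare <testsuite> root.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for suite in suites:
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals
```

A report that fails to parse raises `xml.etree.ElementTree.ParseError`, which is itself a useful signal that the file is not valid XML.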

Checklist

  • I have read CONTRIBUTING.md
  • No .env or secret files are included in this PR
  • All changes are within scope of the linked Jira ticket (if not, explain in Description)

Review Guidance

  • Start with .github/workflows/agent-tests.yml — the core of this PR. Single job with discover → loop → report.
  • Makefile changes are mechanical — same 2-line diff across all 9 agents (add --ignore flags + $(PYTEST_ARGS)).
  • Test fixes (agentic_rag/tests/test_tools.py, react_agent/tests/test_tools.py, openai_responses_agent/tests/test_health.py) — sync tests with current implementations.
  • After merge, configure branch protection to require "Unit Tests" as a status check.

Related PRs

None

@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e116a7c4-1805-48ac-beac-e277ec378de3

📥 Commits

Reviewing files that changed from the base of the PR and between 7847a6e and 2fabfa6.

📒 Files selected for processing (13)
  • .github/workflows/agent-tests.yml
  • agents/autogen/mcp_agent/Makefile
  • agents/crewai/websearch_agent/Makefile
  • agents/google/adk/Makefile
  • agents/langgraph/agentic_rag/Makefile
  • agents/langgraph/agentic_rag/tests/test_tools.py
  • agents/langgraph/human_in_the_loop/Makefile
  • agents/langgraph/react_agent/Makefile
  • agents/langgraph/react_agent/tests/test_tools.py
  • agents/langgraph/react_with_database_memory/Makefile
  • agents/llamaindex/websearch_agent/Makefile
  • agents/vanilla_python/openai_responses_agent/Makefile
  • agents/vanilla_python/openai_responses_agent/tests/test_health.py
✅ Files skipped from review due to trivial changes (2)
  • agents/langgraph/human_in_the_loop/Makefile
  • agents/langgraph/react_agent/tests/test_tools.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • agents/langgraph/agentic_rag/Makefile
  • agents/langgraph/react_agent/Makefile
  • agents/llamaindex/websearch_agent/Makefile
  • agents/langgraph/react_with_database_memory/Makefile
  • agents/crewai/websearch_agent/Makefile
  • agents/vanilla_python/openai_responses_agent/tests/test_health.py
  • .github/workflows/agent-tests.yml
  • agents/langgraph/agentic_rag/tests/test_tools.py

📝 Walkthrough

Walkthrough

This PR introduces a GitHub Actions workflow for automated agent testing and standardizes test execution across 9 agents. Makefile test targets now exclude behavioral tests and support parameterized pytest arguments. Test logic is refactored in agentic_rag to use environment variable mocking instead of dotenv, and minor test assertions are updated in react_agent and openai_responses_agent.

Changes

Test Infrastructure and Agent Testing Standardization

| Layer / File(s) | Summary |
|---|---|
| GitHub Actions Agent Test Workflow: .github/workflows/agent-tests.yml | Defines a new workflow that runs when agents are modified on main or in PRs, discovers per-agent test directories, filters agents based on changed files for PRs, executes make test per selected agent with concurrency control, generates per-agent JUnit XML, and publishes an aggregated test report with conditional annotations. |
| Makefile Test Target Standardization: agents/autogen/mcp_agent/Makefile, agents/crewai/websearch_agent/Makefile, agents/google/adk/Makefile, agents/langgraph/agentic_rag/Makefile, agents/langgraph/human_in_the_loop/Makefile, agents/langgraph/react_agent/Makefile, agents/langgraph/react_with_database_memory/Makefile, agents/llamaindex/websearch_agent/Makefile, agents/vanilla_python/openai_responses_agent/Makefile | All 9 agent test targets now exclude tests/behavioral in addition to tests/integration, and support optional $(PYTEST_ARGS) for extra pytest configuration, standardizing test invocation across the agent suite. |
| Test Assertion and Fixture Updates: agents/langgraph/react_agent/tests/test_tools.py, agents/vanilla_python/openai_responses_agent/tests/test_health.py | React agent test assertion changed to expect "RedHat" in dummy web search results; openai_responses_agent health check fixture refactored to use direct monkeypatching of main.get_agent with raise_server_exceptions=False instead of unittest.mock.patch. |
| Agentic RAG Test Refactoring: agents/langgraph/agentic_rag/tests/test_tools.py | Removes python-dotenv dependency and replaces environment loading with getenv mocking via side_effect maps providing BASE_URL, VECTOR_STORE_ID, and API_KEY. Initialization assertions now expect both base_url and api_key in LlamaStackClient construction. Error handling changed to raise RuntimeError when VECTOR_STORE_ID is missing, with assertion validating error message mentions VECTOR_STORE_ID and load_documents.py. |
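
The getenv mocking pattern described for the agentic_rag tests can be sketched in isolation as follows. This is not the repository's code: `read_vector_store_settings` is a hypothetical stand-in for the env-reading part of `get_retriever_components`, the error text is invented for illustration, and `fake_getenv` is an assumed helper that wraps a dict in a `Mock` with a `side_effect`.

```python
from unittest.mock import Mock


def read_vector_store_settings(getenv):
    """Hypothetical stand-in: read settings via an injected getenv callable."""
    vector_store_id = getenv("VECTOR_STORE_ID")
    if not vector_store_id:
        # Illustrative message only; the real wording lives in the agent code.
        raise RuntimeError(
            "VECTOR_STORE_ID is not set; run load_documents.py to create a vector store"
        )
    return {
        "base_url": getenv("BASE_URL"),
        "api_key": getenv("API_KEY"),
        "vector_store_id": vector_store_id,
    }


def fake_getenv(values):
    """Return a Mock whose calls look up keys in `values` (None when absent)."""
    return Mock(side_effect=lambda key: values.get(key))
```

Driving the mock from a dict keeps each test's environment explicit, so a test for the missing-variable path only has to omit `VECTOR_STORE_ID` from the map.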

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a CI workflow for agent unit tests with auto-discovery and reporting capabilities. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, covering the workflow design, Makefile standardizations, test fixes, and testing validation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
agents/vanilla_python/openai_responses_agent/tests/test_health.py (1)

10-14: ⚡ Quick win

Use monkeypatch for guaranteed fixture cleanup.

Lines 10–14 manually override and restore main.get_agent; if an error occurs during setup, before the restore runs, the patch can leak across tests. Prefer pytest’s monkeypatch for automatic restoration.

Proposed change
-@pytest.fixture
-def client():
+@pytest.fixture
+def client(monkeypatch):
     """Create a test client with the agent global set to a mock factory."""
     import main
 
-    original = main.get_agent
-    main.get_agent = lambda: None
+    monkeypatch.setattr(main, "get_agent", lambda: None)
     with TestClient(main.app, raise_server_exceptions=False) as c:
         yield c
-    main.get_agent = original
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/vanilla_python/openai_responses_agent/tests/test_health.py` around
lines 10 - 14, Replace the manual save/restore of main.get_agent in the test
fixture with pytest’s monkeypatch to ensure automatic cleanup: instead of
assigning original = main.get_agent / main.get_agent = lambda: None /
main.get_agent = original, call monkeypatch.setattr(main, "get_agent", lambda:
None) before creating the TestClient against main.app so the patched get_agent
is automatically restored after the test; update the fixture signature to accept
the monkeypatch fixture and remove the manual restore logic around TestClient.
agents/langgraph/agentic_rag/tests/test_tools.py (2)

247-261: ⚡ Quick win

Keep this test isolated from real client initialization

Line 248 removed the LlamaStackClient patch. If initialization order changes, this test can start constructing a real client and become brittle. Patch it and assert it is not called for the missing VECTOR_STORE_ID path.

Proposed hardening
+@patch("src.agentic_rag.tools.LlamaStackClient")
 @patch("src.agentic_rag.tools.getenv")
-def test_get_retriever_components_no_vector_store(mock_get_env):
+def test_get_retriever_components_no_vector_store(mock_get_env, mock_client_class):
     """Test error handling when VECTOR_STORE_ID env var is not set."""
@@
     def getenv_side_effect(key):
-        return {"BASE_URL": "http://localhost:8321"}.get(key)
+        return {
+            "BASE_URL": "http://localhost:8321",
+            "API_KEY": "test-key",
+        }.get(key)
@@
     with pytest.raises(RuntimeError) as exc_info:
         get_retriever_components()
+    mock_client_class.assert_not_called()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/langgraph/agentic_rag/tests/test_tools.py` around lines 247 - 261, The
test test_get_retriever_components_no_vector_store must remain isolated from
real client creation: re-add a patch for LlamaStackClient (or the constructor
used to build the client) so that no real client is instantiated, reset
tools_module._client_cache and tools_module._vector_store_id_cache as done, keep
getenv patched to return only BASE_URL, call get_retriever_components(), assert
it raises RuntimeError, and also assert the patched LlamaStackClient was never
called to ensure the missing VECTOR_STORE_ID path does not trigger client
initialization.

240-243: ⚡ Quick win

Exercise the /v1 normalization path explicitly

Line 240 states /v1 stripping is expected, but Line 238 passes a URL without /v1, so that branch is not validated.

Proposed test tweak
-    # Call with explicit base_url
-    result = get_retriever_components(base_url="http://custom:9999")
+    # Call with explicit base_url including /v1 suffix
+    result = get_retriever_components(base_url="http://custom:9999/v1")

     # Should use provided base_url (stripped of /v1 suffix if present)
     mock_client_class.assert_called_once_with(
         base_url="http://custom:9999", api_key="test-key"
     )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/langgraph/agentic_rag/tests/test_tools.py` around lines 240 - 243, The
test currently asserts base_url normalization but doesn't exercise the `/v1`
stripping branch; update the test in
agents/langgraph/agentic_rag/tests/test_tools.py (the test that calls
mock_client_class.assert_called_once_with) to pass a base_url containing the
`/v1` suffix (e.g., "http://custom:9999/v1") when constructing the client so the
code path that strips `/v1` is executed and the assertion still expects
base_url="http://custom:9999" and api_key="test-key".
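
For context, the `/v1` normalization that this comment wants exercised could look something like the sketch below. The function `normalize_base_url` is hypothetical and only approximates the behavior the test comment describes ("stripped of /v1 suffix if present"); the real logic inside `get_retriever_components` may handle edge cases differently.

```python
def normalize_base_url(base_url):
    """Drop a trailing /v1 segment (and trailing slashes) from a base URL.

    Hypothetical helper for illustration; not taken from the agent code.
    """
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        trimmed = trimmed[: -len("/v1")]
    return trimmed
```

With a helper like this, a test that passes `"http://custom:9999"` never reaches the stripping branch, which is why the review suggests passing `"http://custom:9999/v1"` instead.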
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/agent-tests.yml:
- Line 95: The expression setting annotate_only incorrectly assumes
github.event.pull_request exists; update the job input to first check
github.event_name == 'pull_request' and only then compare
github.event.pull_request.head.repo.full_name != github.repository, otherwise
set annotate_only to false. Concretely, change the annotate_only value to a
conditional that uses github.event_name (e.g., github.event_name ==
'pull_request' && (github.event.pull_request.head.repo.full_name !=
github.repository)) so annotate_only is only computed for PR events and is false
for push/workflow_dispatch.
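
The guarded condition requested above can be modeled in Python to make the intended truth table explicit. This is only a sketch of the logic, not the workflow expression itself: `annotate_only` here is a hypothetical function, and `event` stands in for the `github.event` payload as a plain dict.

```python
def annotate_only(event_name, event, repository):
    """Return True only for pull requests whose head repo differs from the base repo."""
    if event_name != "pull_request":
        # push and workflow_dispatch payloads carry no pull_request object
        return False
    head_repo = (
        event.get("pull_request", {})
        .get("head", {})
        .get("repo", {})
        .get("full_name")
    )
    # A missing head repo is treated like a fork in this sketch (an assumption).
    return head_repo != repository
```

Checking the event name first means the comparison against the head repository is only evaluated when a pull request payload actually exists.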


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 94af0893-8d14-4373-a8ce-042178ecc16a

📥 Commits

Reviewing files that changed from the base of the PR and between a327361 and 6b5a90a.

📒 Files selected for processing (13)
  • .github/workflows/agent-tests.yml
  • agents/autogen/mcp_agent/Makefile
  • agents/crewai/websearch_agent/Makefile
  • agents/google/adk/Makefile
  • agents/langgraph/agentic_rag/Makefile
  • agents/langgraph/agentic_rag/tests/test_tools.py
  • agents/langgraph/human_in_the_loop/Makefile
  • agents/langgraph/react_agent/Makefile
  • agents/langgraph/react_agent/tests/test_tools.py
  • agents/langgraph/react_with_database_memory/Makefile
  • agents/llamaindex/websearch_agent/Makefile
  • agents/vanilla_python/openai_responses_agent/Makefile
  • agents/vanilla_python/openai_responses_agent/tests/test_health.py

Comment thread .github/workflows/agent-tests.yml Outdated
@tarun-etikala tarun-etikala force-pushed the feat/RHAIENG-4065-ci-agent-tests branch from 6b5a90a to 7847a6e Compare May 15, 2026 16:07
@tarun-etikala tarun-etikala self-assigned this May 15, 2026
Add a new agent-tests.yml workflow that runs unit tests on PRs and
pushes to main. Key features:

- Auto-discovers agents with tests (tests/test_*.py) — no workflow
  edits needed when new agents are added
- On PRs, runs only changed agents' tests via git diff filtering
- On push to main, runs all agents (full regression)
- Produces a consolidated "Agent Test Results" check via
  mikepenz/action-junit-report with inline failure annotations
- All actions pinned to Node.js 24 SHAs
- Test result artifacts retained for 1 day only

Also fixes broken unit tests across 3 agents (react_agent,
agentic_rag, openai_responses_agent) and standardizes all 9 agent
Makefiles to exclude integration/behavioral tests from make test,
with $(PYTEST_ARGS) support for CI to inject --junitxml.

Ref: RHAIENG-4065

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tarun-etikala tarun-etikala force-pushed the feat/RHAIENG-4065-ci-agent-tests branch from 7847a6e to 2fabfa6 Compare May 15, 2026 16:13
@andrewdonheiser andrewdonheiser self-requested a review May 15, 2026 16:53
Contributor

@andrewdonheiser andrewdonheiser left a comment


lgtm

@tarun-etikala tarun-etikala merged commit 237a0b5 into red-hat-data-services:main May 15, 2026
4 checks passed

2 participants