ci: add agent unit test workflow with auto-discovery and reporting by tarun-etikala · Pull Request #103 · red-hat-data-services/agentic-starter-kits

tarun-etikala · 2026-05-15T15:25:05Z

Description

Add a CI workflow (agent-tests.yml) that automatically discovers and runs agent unit tests on every PR and push to main. This closes the gap where broken agent code could merge without test validation.

Key features:

Auto-discovery — scans agents/*/*/tests/test_*.py to find testable agents. No workflow edits needed when new agents are added.
Path filtering on PRs — uses git diff to run only changed agents' tests. Push to main runs all agents (full regression).
Consolidated test report — mikepenz/action-junit-report produces a single "Agent Test Results" check with pass/fail counts and inline failure annotations on the PR diff.
Single job — discover + test loop + report in one job. Tests are fast (<1s per agent), so matrix overhead isn't justified.
All actions are pinned by commit SHA to Node.js 24 releases (checkout v5.0.1, setup-python v6.2.0, setup-uv v7.6.0).
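The auto-discovery and path-filtering behaviour listed above can be sketched in Python. This is only an illustration of the logic, not the workflow's actual script: the function names `discover_agents` and `select_agents` are hypothetical, and the changed-file list is assumed to come from something like `git diff --name-only`.

```python
from __future__ import annotations

from pathlib import Path


def discover_agents(repo_root: Path) -> list[str]:
    """Return agent directories (relative, POSIX-style) that contain unit tests."""
    agents = {
        # agents/<framework>/<agent>/tests/test_x.py -> agents/<framework>/<agent>
        test_file.parent.parent.relative_to(repo_root).as_posix()
        for test_file in repo_root.glob("agents/*/*/tests/test_*.py")
    }
    return sorted(agents)


def select_agents(agents: list[str], changed_files: list[str] | None) -> list[str]:
    """Pick the agents whose tests should run.

    changed_files is None for a push to main (full regression); for a PR it
    is the list of changed paths (assumed input, e.g. from a git diff).
    """
    if changed_files is None:
        return agents
    return [
        agent
        for agent in agents
        if any(path.startswith(agent + "/") for path in changed_files)
    ]
```

Because discovery is driven by the glob pattern alone, adding a new agent with a `tests/test_*.py` file is enough for it to be picked up.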

Also included:

Standardized all 9 agent Makefiles: make test now excludes tests/integration/ and tests/behavioral/, and accepts $(PYTEST_ARGS) so CI can inject --junitxml.
Fixed 5 broken unit tests across 3 agents:
- react_agent: stale assertion in test_dummy_web_search_return_format
- agentic_rag: 3 tests out of sync with refactored get_retriever_components (now reads VECTOR_STORE_ID from env); removed ad-hoc integration test
- openai_responses_agent: health test mock wasn't effective due to lifespan reference capture

Jira Ticket

RHAIENG-4065

Testing

make test passes (run from the affected agent directory)
Manual testing performed (describe steps below)
No testing required (documentation/config change only)

Ran make test for all 6 agents with unit tests locally — 5/6 pass (crewai fails only due to macOS ARM onnxruntime wheel, passes on Linux). Verified make test PYTEST_ARGS="--junitxml=results.xml -v" produces valid JUnit XML. Validated workflow with actionlint. Tested discover script locally with simulated multi-agent PR diffs.
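One way to sanity-check a JUnit XML file like the results.xml mentioned above is to parse it and total the counters. The sketch below is an assumption about how such a check could look, not what was run for this PR; `summarize_junit` is a hypothetical helper, and it relies only on the standard `tests`, `failures`, `errors`, and `skipped` attributes of `<testsuite>`.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict:
    """Total the test counters in a JUnit XML document.

    ET.fromstring raises ParseError if the XML is not well-formed,
    which doubles as a basic validity check.
    """
    root = ET.fromstring(xml_text)
    # Reports may wrap suites in <testsuites>; accept a bare <testsuite> too.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for suite in suites:
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    totals["passed"] = (
        totals["tests"] - totals["failures"] - totals["errors"] - totals["skipped"]
    )
    return totals
```

Running this over each agent's report and adding the totals gives the same kind of pass/fail summary that the consolidated check displays.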

Checklist

I have read CONTRIBUTING.md
No .env or secret files are included in this PR
All changes are within scope of the linked Jira ticket (if not, explain in Description)

Review Guidance

Start with .github/workflows/agent-tests.yml — the core of this PR. Single job with discover → loop → report.
Makefile changes are mechanical — same 2-line diff across all 9 agents (add --ignore flags + $(PYTEST_ARGS)).
Test fixes (agentic_rag/tests/test_tools.py, react_agent/tests/test_tools.py, openai_responses_agent/tests/test_health.py) — sync tests with current implementations.
After merge, configure branch protection to require "Unit Tests" as a status check.

Related PRs

None

coderabbitai · 2026-05-15T15:25:19Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e116a7c4-1805-48ac-beac-e277ec378de3

📥 Commits

Reviewing files that changed from the base of the PR and between 7847a6e and 2fabfa6.

📒 Files selected for processing (13)

.github/workflows/agent-tests.yml
agents/autogen/mcp_agent/Makefile
agents/crewai/websearch_agent/Makefile
agents/google/adk/Makefile
agents/langgraph/agentic_rag/Makefile
agents/langgraph/agentic_rag/tests/test_tools.py
agents/langgraph/human_in_the_loop/Makefile
agents/langgraph/react_agent/Makefile
agents/langgraph/react_agent/tests/test_tools.py
agents/langgraph/react_with_database_memory/Makefile
agents/llamaindex/websearch_agent/Makefile
agents/vanilla_python/openai_responses_agent/Makefile
agents/vanilla_python/openai_responses_agent/tests/test_health.py

✅ Files skipped from review due to trivial changes (2)

agents/langgraph/human_in_the_loop/Makefile
agents/langgraph/react_agent/tests/test_tools.py

🚧 Files skipped from review as they are similar to previous changes (8)

agents/langgraph/agentic_rag/Makefile
agents/langgraph/react_agent/Makefile
agents/llamaindex/websearch_agent/Makefile
agents/langgraph/react_with_database_memory/Makefile
agents/crewai/websearch_agent/Makefile
agents/vanilla_python/openai_responses_agent/tests/test_health.py
.github/workflows/agent-tests.yml
agents/langgraph/agentic_rag/tests/test_tools.py

📝 Walkthrough

Walkthrough

This PR introduces a GitHub Actions workflow for automated agent testing and standardizes test execution across 9 agents. Makefile test targets now exclude behavioral tests and support parameterized pytest arguments. Test logic is refactored in agentic_rag to use environment variable mocking instead of dotenv, and minor test assertions are updated in react_agent and openai_responses_agent.

Changes

Test Infrastructure and Agent Testing Standardization

Layer / File(s)	Summary
GitHub Actions Agent Test Workflow `.github/workflows/agent-tests.yml`	Defines a new workflow that runs when agents are modified on main or in PRs, discovers per-agent test directories, filters agents based on changed files for PRs, executes `make test` per selected agent with concurrency control, generates per-agent JUnit XML, and publishes an aggregated test report with conditional annotations.
Makefile Test Target Standardization `agents/autogen/mcp_agent/Makefile`, `agents/crewai/websearch_agent/Makefile`, `agents/google/adk/Makefile`, `agents/langgraph/agentic_rag/Makefile`, `agents/langgraph/human_in_the_loop/Makefile`, `agents/langgraph/react_agent/Makefile`, `agents/langgraph/react_with_database_memory/Makefile`, `agents/llamaindex/websearch_agent/Makefile`, `agents/vanilla_python/openai_responses_agent/Makefile`	All 9 agent `test` targets now exclude `tests/behavioral` in addition to `tests/integration`, and support optional `$(PYTEST_ARGS)` for extra pytest configuration, standardizing test invocation across the agent suite.
Test Assertion and Fixture Updates `agents/langgraph/react_agent/tests/test_tools.py`, `agents/vanilla_python/openai_responses_agent/tests/test_health.py`	React agent test assertion changed to expect "RedHat" in dummy web search results; openai_responses_agent health check fixture refactored to use direct monkeypatching of `main.get_agent` with `raise_server_exceptions=False` instead of `unittest.mock.patch`.
Agentic RAG Test Refactoring `agents/langgraph/agentic_rag/tests/test_tools.py`	Removes `python-dotenv` dependency and replaces environment loading with `getenv` mocking via `side_effect` maps providing `BASE_URL`, `VECTOR_STORE_ID`, and `API_KEY`. Initialization assertions now expect both `base_url` and `api_key` in `LlamaStackClient` construction. Error handling changed to raise `RuntimeError` when `VECTOR_STORE_ID` is missing, with assertion validating error message mentions `VECTOR_STORE_ID` and `load_documents.py`.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding a CI workflow for agent unit tests with auto-discovery and reporting capabilities.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, covering the workflow design, Makefile standardizations, test fixes, and testing validation.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

agents/vanilla_python/openai_responses_agent/tests/test_health.py (1)

10-14: ⚡ Quick win

Use monkeypatch for guaranteed fixture cleanup.

Line 10–Line 14 manually override and restore main.get_agent; if setup errors occur before teardown, the patch can leak across tests. Prefer pytest’s monkeypatch for automatic restoration.

Proposed change

-@pytest.fixture
-def client():
+@pytest.fixture
+def client(monkeypatch):
     """Create a test client with the agent global set to a mock factory."""
     import main
 
-    original = main.get_agent
-    main.get_agent = lambda: None
+    monkeypatch.setattr(main, "get_agent", lambda: None)
     with TestClient(main.app, raise_server_exceptions=False) as c:
         yield c
-    main.get_agent = original

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/vanilla_python/openai_responses_agent/tests/test_health.py` around
lines 10 - 14, Replace the manual save/restore of main.get_agent in the test
fixture with pytest’s monkeypatch to ensure automatic cleanup: instead of
assigning original = main.get_agent / main.get_agent = lambda: None /
main.get_agent = original, call monkeypatch.setattr(main, "get_agent", lambda:
None) before creating the TestClient against main.app so the patched get_agent
is automatically restored after the test; update the fixture signature to accept
the monkeypatch fixture and remove the manual restore logic around TestClient.

agents/langgraph/agentic_rag/tests/test_tools.py (2)

247-261: ⚡ Quick win

Keep this test isolated from real client initialization

Line 248 removed the LlamaStackClient patch. If initialization order changes, this test can start constructing a real client and become brittle. Patch it and assert it is not called for the missing VECTOR_STORE_ID path.

Proposed hardening

+@patch("src.agentic_rag.tools.LlamaStackClient")
 @patch("src.agentic_rag.tools.getenv")
-def test_get_retriever_components_no_vector_store(mock_get_env):
+def test_get_retriever_components_no_vector_store(mock_get_env, mock_client_class):
     """Test error handling when VECTOR_STORE_ID env var is not set."""
@@
     def getenv_side_effect(key):
-        return {"BASE_URL": "http://localhost:8321"}.get(key)
+        return {
+            "BASE_URL": "http://localhost:8321",
+            "API_KEY": "test-key",
+        }.get(key)
@@
     with pytest.raises(RuntimeError) as exc_info:
         get_retriever_components()
+    mock_client_class.assert_not_called()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/langgraph/agentic_rag/tests/test_tools.py` around lines 247 - 261, The
test test_get_retriever_components_no_vector_store must remain isolated from
real client creation: re-add a patch for LlamaStackClient (or the constructor
used to build the client) so that no real client is instantiated, reset
tools_module._client_cache and tools_module._vector_store_id_cache as done, keep
getenv patched to return only BASE_URL, call get_retriever_components(), assert
it raises RuntimeError, and also assert the patched LlamaStackClient was never
called to ensure the missing VECTOR_STORE_ID path does not trigger client
initialization.
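The isolation pattern this comment asks for can be shown in a self-contained form. The sketch below does not use the real `src.agentic_rag.tools` module: `get_retriever_components` here is a hypothetical stand-in that takes its dependencies as arguments, its check order (VECTOR_STORE_ID before client construction) is an assumption, and the error text is invented apart from the two names the test is said to look for.

```python
from unittest.mock import MagicMock


def get_retriever_components(getenv, client_factory):
    """Hypothetical stand-in: validates VECTOR_STORE_ID before building a client."""
    vector_store_id = getenv("VECTOR_STORE_ID")
    if not vector_store_id:
        raise RuntimeError(
            "VECTOR_STORE_ID is not set; run load_documents.py first"
        )
    client = client_factory(base_url=getenv("BASE_URL"), api_key=getenv("API_KEY"))
    return client, vector_store_id


def check_missing_vector_store_id():
    """Exercise the missing-VECTOR_STORE_ID path with mocks only."""
    env = {"BASE_URL": "http://localhost:8321", "API_KEY": "test-key"}
    mock_getenv = MagicMock(side_effect=env.get)  # unknown keys return None
    mock_client_class = MagicMock()
    try:
        get_retriever_components(mock_getenv, mock_client_class)
    except RuntimeError as exc:
        message = str(exc)
    else:
        raise AssertionError("expected RuntimeError")
    assert "VECTOR_STORE_ID" in message
    # The error path must not have constructed a client.
    mock_client_class.assert_not_called()
    return message
```

With the client mocked, a later change in initialization order fails the `assert_not_called()` check instead of silently constructing a real client.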

240-243: ⚡ Quick win

Exercise the /v1 normalization path explicitly

Line 240 states /v1 stripping is expected, but Line 238 passes a URL without /v1, so that branch is not validated.

Proposed test tweak

-    # Call with explicit base_url
-    result = get_retriever_components(base_url="http://custom:9999")
+    # Call with explicit base_url including /v1 suffix
+    result = get_retriever_components(base_url="http://custom:9999/v1")

     # Should use provided base_url (stripped of /v1 suffix if present)
     mock_client_class.assert_called_once_with(
         base_url="http://custom:9999", api_key="test-key"
     )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents/langgraph/agentic_rag/tests/test_tools.py` around lines 240 - 243, The
test currently asserts base_url normalization but doesn't exercise the `/v1`
stripping branch; update the test in
agents/langgraph/agentic_rag/tests/test_tools.py (the test that calls
mock_client_class.assert_called_once_with) to pass a base_url containing the
`/v1` suffix (e.g., "http://custom:9999/v1") when constructing the client so the
code path that strips `/v1` is executed and the assertion still expects
base_url="http://custom:9999" and api_key="test-key".
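For reference, the kind of normalization this test is meant to exercise might look like the sketch below. The helper name `normalize_base_url` and its exact rules are assumptions for illustration; the real logic inside `get_retriever_components` is not shown in this PR.

```python
def normalize_base_url(base_url: str) -> str:
    """Drop trailing slashes and a trailing /v1 path segment, if present."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        trimmed = trimmed[: -len("/v1")]
    return trimmed
```

A test that only passes `http://custom:9999` never enters the `endswith("/v1")` branch, which is why the suggested input includes the suffix.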

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/agent-tests.yml:
- Line 95: The expression setting annotate_only incorrectly assumes
github.event.pull_request exists; update the job input to first check
github.event_name == 'pull_request' and only then compare
github.event.pull_request.head.repo.full_name != github.repository, otherwise
set annotate_only to false. Concretely, change the annotate_only value to a
conditional that uses github.event_name (e.g., github.event_name ==
'pull_request' && (github.event.pull_request.head.repo.full_name !=
github.repository)) so annotate_only is only computed for PR events and is false
for push/workflow_dispatch.
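
The guard described in this comment can be modelled in Python to make the intended truth table explicit. This is only a model of the boolean logic, not GitHub Actions expression syntax; the function name and the dict-shaped `event` argument are assumptions that mirror the `github.event_name`, `github.event`, and `github.repository` contexts.

```python
def annotate_only(event_name: str, event: dict, repository: str) -> bool:
    """Return True only for pull requests opened from a fork."""
    if event_name != "pull_request":
        # push / workflow_dispatch payloads carry no pull_request data
        return False
    head_repo = event["pull_request"]["head"]["repo"]["full_name"]
    return head_repo != repository
```

Checking the event name first means the pull request payload is only read when it is known to exist.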



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 94af0893-8d14-4373-a8ce-042178ecc16a

📥 Commits

Reviewing files that changed from the base of the PR and between a327361 and 6b5a90a.

📒 Files selected for processing (13)

.github/workflows/agent-tests.yml
agents/autogen/mcp_agent/Makefile
agents/crewai/websearch_agent/Makefile
agents/google/adk/Makefile
agents/langgraph/agentic_rag/Makefile
agents/langgraph/agentic_rag/tests/test_tools.py
agents/langgraph/human_in_the_loop/Makefile
agents/langgraph/react_agent/Makefile
agents/langgraph/react_agent/tests/test_tools.py
agents/langgraph/react_with_database_memory/Makefile
agents/llamaindex/websearch_agent/Makefile
agents/vanilla_python/openai_responses_agent/Makefile
agents/vanilla_python/openai_responses_agent/tests/test_health.py

Add a new agent-tests.yml workflow that runs unit tests on PRs and pushes to main.

Key features:
- Auto-discovers agents with tests (tests/test_*.py); no workflow edits needed when new agents are added
- On PRs, runs only changed agents' tests via git diff filtering
- On push to main, runs all agents (full regression)
- Produces a consolidated "Agent Test Results" check via mikepenz/action-junit-report with inline failure annotations
- All actions pinned to Node.js 24 SHAs
- Test result artifacts retained for 1 day only

Also fixes broken unit tests across 3 agents (react_agent, agentic_rag, openai_responses_agent) and standardizes all 9 agent Makefiles to exclude integration/behavioral tests from make test, with $(PYTEST_ARGS) support for CI to inject --junitxml.

Ref: RHAIENG-4065

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

andrewdonheiser

lgtm

github-actions Bot added area/langgraph area/crewai area/autogen area/llamaindex area/google-adk area/vanilla-python area/ci labels May 15, 2026

github-actions Bot added the size/m label May 15, 2026

coderabbitai Bot reviewed May 15, 2026

Comment thread .github/workflows/agent-tests.yml Outdated

tarun-etikala force-pushed the feat/RHAIENG-4065-ci-agent-tests branch from 6b5a90a to 7847a6e Compare May 15, 2026 16:07

tarun-etikala self-assigned this May 15, 2026

tarun-etikala force-pushed the feat/RHAIENG-4065-ci-agent-tests branch from 7847a6e to 2fabfa6 Compare May 15, 2026 16:13

andrewdonheiser self-requested a review May 15, 2026 16:53

andrewdonheiser approved these changes May 15, 2026

tarun-etikala merged commit 237a0b5 into red-hat-data-services:main May 15, 2026
4 checks passed

tarun-etikala mentioned this pull request May 18, 2026

ci: add CODEOWNERS and fix Unit Tests check for required status checks #104

Merged

4 tasks

tarun-etikala merged 1 commit into red-hat-data-services:main from tarun-etikala:feat/RHAIENG-4065-ci-agent-tests
