ci: add agent unit test workflow with auto-discovery and reporting#103
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (13)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (8)
📝 WalkthroughWalkthroughThis PR introduces a GitHub Actions workflow for automated agent testing and standardizes test execution across 9 agents. Makefile test targets now exclude behavioral tests and support parameterized pytest arguments. Test logic is refactored in agentic_rag to use environment variable mocking instead of dotenv, and minor test assertions are updated in react_agent and openai_responses_agent. ChangesTest Infrastructure and Agent Testing Standardization
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
agents/vanilla_python/openai_responses_agent/tests/test_health.py (1)
10-14: ⚡ Quick winUse
monkeypatchfor guaranteed fixture cleanup.Line 10–Line 14 manually override and restore
main.get_agent; if setup errors occur before teardown, the patch can leak across tests. Preferpytest’smonkeypatchfor automatic restoration.Proposed change
-@pytest.fixture -def client(): +@pytest.fixture +def client(monkeypatch): """Create a test client with the agent global set to a mock factory.""" import main - original = main.get_agent - main.get_agent = lambda: None + monkeypatch.setattr(main, "get_agent", lambda: None) with TestClient(main.app, raise_server_exceptions=False) as c: yield c - main.get_agent = original🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/vanilla_python/openai_responses_agent/tests/test_health.py` around lines 10 - 14, Replace the manual save/restore of main.get_agent in the test fixture with pytest’s monkeypatch to ensure automatic cleanup: instead of assigning original = main.get_agent / main.get_agent = lambda: None / main.get_agent = original, call monkeypatch.setattr(main, "get_agent", lambda: None) before creating the TestClient against main.app so the patched get_agent is automatically restored after the test; update the fixture signature to accept the monkeypatch fixture and remove the manual restore logic around TestClient.agents/langgraph/agentic_rag/tests/test_tools.py (2)
247-261: ⚡ Quick winKeep this test isolated from real client initialization
Line 248 removed the
LlamaStackClientpatch. If initialization order changes, this test can start constructing a real client and become brittle. Patch it and assert it is not called for the missingVECTOR_STORE_IDpath.Proposed hardening
+@patch("src.agentic_rag.tools.LlamaStackClient") `@patch`("src.agentic_rag.tools.getenv") -def test_get_retriever_components_no_vector_store(mock_get_env): +def test_get_retriever_components_no_vector_store(mock_get_env, mock_client_class): """Test error handling when VECTOR_STORE_ID env var is not set.""" @@ def getenv_side_effect(key): - return {"BASE_URL": "http://localhost:8321"}.get(key) + return { + "BASE_URL": "http://localhost:8321", + "API_KEY": "test-key", + }.get(key) @@ with pytest.raises(RuntimeError) as exc_info: get_retriever_components() + mock_client_class.assert_not_called()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/langgraph/agentic_rag/tests/test_tools.py` around lines 247 - 261, The test test_get_retriever_components_no_vector_store must remain isolated from real client creation: re-add a patch for LlamaStackClient (or the constructor used to build the client) so that no real client is instantiated, reset tools_module._client_cache and tools_module._vector_store_id_cache as done, keep getenv patched to return only BASE_URL, call get_retriever_components(), assert it raises RuntimeError, and also assert the patched LlamaStackClient was never called to ensure the missing VECTOR_STORE_ID path does not trigger client initialization.
240-243: ⚡ Quick winExercise the
/v1normalization path explicitlyLine 240 states
/v1stripping is expected, but Line 238 passes a URL without/v1, so that branch is not validated.Proposed test tweak
- # Call with explicit base_url - result = get_retriever_components(base_url="http://custom:9999") + # Call with explicit base_url including /v1 suffix + result = get_retriever_components(base_url="http://custom:9999/v1") # Should use provided base_url (stripped of /v1 suffix if present) mock_client_class.assert_called_once_with( base_url="http://custom:9999", api_key="test-key" )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/langgraph/agentic_rag/tests/test_tools.py` around lines 240 - 243, The test currently asserts base_url normalization but doesn't exercise the `/v1` stripping branch; update the test in agents/langgraph/agentic_rag/tests/test_tools.py (the test that calls mock_client_class.assert_called_once_with) to pass a base_url containing the `/v1` suffix (e.g., "http://custom:9999/v1") when constructing the client so the code path that strips `/v1` is executed and the assertion still expects base_url="http://custom:9999" and api_key="test-key".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/agent-tests.yml:
- Line 95: The expression setting annotate_only incorrectly assumes
github.event.pull_request exists; update the job input to first check
github.event_name == 'pull_request' and only then compare
github.event.pull_request.head.repo.full_name != github.repository, otherwise
set annotate_only to false. Concretely, change the annotate_only value to a
conditional that uses github.event_name (e.g., github.event_name ==
'pull_request' && (github.event.pull_request.head.repo.full_name !=
github.repository)) so annotate_only is only computed for PR events and is false
for push/workflow_dispatch.
---
Nitpick comments:
In `@agents/langgraph/agentic_rag/tests/test_tools.py`:
- Around line 247-261: The test test_get_retriever_components_no_vector_store
must remain isolated from real client creation: re-add a patch for
LlamaStackClient (or the constructor used to build the client) so that no real
client is instantiated, reset tools_module._client_cache and
tools_module._vector_store_id_cache as done, keep getenv patched to return only
BASE_URL, call get_retriever_components(), assert it raises RuntimeError, and
also assert the patched LlamaStackClient was never called to ensure the missing
VECTOR_STORE_ID path does not trigger client initialization.
- Around line 240-243: The test currently asserts base_url normalization but
doesn't exercise the `/v1` stripping branch; update the test in
agents/langgraph/agentic_rag/tests/test_tools.py (the test that calls
mock_client_class.assert_called_once_with) to pass a base_url containing the
`/v1` suffix (e.g., "http://custom:9999/v1") when constructing the client so the
code path that strips `/v1` is executed and the assertion still expects
base_url="http://custom:9999" and api_key="test-key".
In `@agents/vanilla_python/openai_responses_agent/tests/test_health.py`:
- Around line 10-14: Replace the manual save/restore of main.get_agent in the
test fixture with pytest’s monkeypatch to ensure automatic cleanup: instead of
assigning original = main.get_agent / main.get_agent = lambda: None /
main.get_agent = original, call monkeypatch.setattr(main, "get_agent", lambda:
None) before creating the TestClient against main.app so the patched get_agent
is automatically restored after the test; update the fixture signature to accept
the monkeypatch fixture and remove the manual restore logic around TestClient.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 94af0893-8d14-4373-a8ce-042178ecc16a
📒 Files selected for processing (13)
.github/workflows/agent-tests.ymlagents/autogen/mcp_agent/Makefileagents/crewai/websearch_agent/Makefileagents/google/adk/Makefileagents/langgraph/agentic_rag/Makefileagents/langgraph/agentic_rag/tests/test_tools.pyagents/langgraph/human_in_the_loop/Makefileagents/langgraph/react_agent/Makefileagents/langgraph/react_agent/tests/test_tools.pyagents/langgraph/react_with_database_memory/Makefileagents/llamaindex/websearch_agent/Makefileagents/vanilla_python/openai_responses_agent/Makefileagents/vanilla_python/openai_responses_agent/tests/test_health.py
6b5a90a to
7847a6e
Compare
Add a new agent-tests.yml workflow that runs unit tests on PRs and pushes to main. Key features: - Auto-discovers agents with tests (tests/test_*.py) — no workflow edits needed when new agents are added - On PRs, runs only changed agents' tests via git diff filtering - On push to main, runs all agents (full regression) - Produces a consolidated "Agent Test Results" check via mikepenz/action-junit-report with inline failure annotations - All actions pinned to Node.js 24 SHAs - Test result artifacts retained for 1 day only Also fixes broken unit tests across 3 agents (react_agent, agentic_rag, openai_responses_agent) and standardizes all 9 agent Makefiles to exclude integration/behavioral tests from make test, with $(PYTEST_ARGS) support for CI to inject --junitxml. Ref: RHAIENG-4065 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7847a6e to
2fabfa6
Compare
Description
Add a CI workflow (
agent-tests.yml) that automatically discovers and runs agent unit tests on every PR and push to main. This closes the gap where broken agent code could merge without test validation.Key features:
agents/*/*/tests/test_*.pyto find testable agents. No workflow edits needed when new agents are added.git diffto run only changed agents' tests. Push to main runs all agents (full regression).mikepenz/action-junit-reportproduces a single "Agent Test Results" check with pass/fail counts and inline failure annotations on the PR diff.Also included:
make testnow excludestests/integration/andtests/behavioral/with$(PYTEST_ARGS)support for CI to inject--junitxml.react_agent: stale assertion intest_dummy_web_search_return_formatagentic_rag: 3 tests out of sync with refactoredget_retriever_components(now readsVECTOR_STORE_IDfrom env); removed ad-hoc integration testopenai_responses_agent: health test mock wasn't effective due to lifespan reference captureJira Ticket
RHAIENG-4065
Testing
make testpasses (run from the affected agent directory)Ran
make testfor all 6 agents with unit tests locally — 5/6 pass (crewai fails only due to macOS ARMonnxruntimewheel, passes on Linux). Verifiedmake test PYTEST_ARGS="--junitxml=results.xml -v"produces valid JUnit XML. Validated workflow withactionlint. Tested discover script locally with simulated multi-agent PR diffs.Checklist
.envor secret files are included in this PRReview Guidance
.github/workflows/agent-tests.yml— the core of this PR. Single job with discover → loop → report.--ignoreflags +$(PYTEST_ARGS)).agentic_rag/tests/test_tools.py,react_agent/tests/test_tools.py,openai_responses_agent/tests/test_health.py) — sync tests with current implementations.Related PRs
None