RHAIENG-5096: extract shared load_golden() from duplicated conftest files #107
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Adds a shared golden-query loader (`evals/harness/fixtures.py`), updates multiple agent behavioral test modules to delegate their `load_golden()` to it (binding a local `FIXTURES_DIR`), and documents the wrapper pattern.
Changes: Golden Test Query Fixture Consolidation
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
a4767e8 to 968453f
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/adding-behavioral-tests.md`:
- Around line 43-50: The snippet uses Path and Any but doesn't import them; add
the missing imports for pathlib.Path and typing.Any at the top of the snippet so
the example is copy/paste-safe. Update the top of the snippet before the
existing from harness.fixtures import load_golden as _load_golden_from line to
import Path and Any, leaving FIXTURES_DIR, load_golden, and _load_golden_from
unchanged.
In `@evals/harness/fixtures.py`:
- Around line 2-5: Reorder and group the imports to satisfy Ruff I001: keep
"from __future__ import annotations" first, then the standard library imports in
alphabetical order ("from pathlib import Path" then "from typing import Any"),
then a blank line, then third-party imports ("import yaml"); ensure blank lines
separate the future, stdlib, and third-party groups and that the exact import
statements ("from __future__ import annotations", "from pathlib import Path",
"from typing import Any", "import yaml") are used as shown.
- Around line 14-15: yaml.safe_load(f) can return None or a non-mapping which
makes the subsequent data.get("queries", []) call raise; update the load logic
(the variable data from yaml.safe_load) to ensure it's a mapping before calling
get (e.g., if not isinstance(data, dict): set data = {} or directly set queries
= []), then derive queries from data.get("queries", []) so that queries is
always a list and does not blow up when the YAML root is empty/scalar.
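The guard suggested in the last comment can be sketched as follows. This is a minimal illustration, not the code in `evals/harness/fixtures.py`: the function name `extract_queries` is hypothetical, `data` stands for whatever `yaml.safe_load(f)` returned, and the assumption that each query is a mapping with a `category` key is mine. The imports also follow the grouping the Ruff comment asks for (future import first, then the standard library).

```python
from __future__ import annotations

from typing import Any


def extract_queries(data: Any, category: str | None = None) -> list[dict[str, Any]]:
    """Return the query list from a parsed YAML document.

    `data` is the value returned by yaml.safe_load(): a mapping for a
    well-formed file, None for an empty file, or a scalar/list otherwise.
    """
    # An empty or scalar YAML root has no "queries" key to read.
    if not isinstance(data, dict):
        return []

    # `or []` also covers an explicit `queries:` with no value (None).
    queries = data.get("queries") or []
    if not isinstance(queries, list):
        return []

    if category is None:
        return queries

    # Assumption: each query is a mapping carrying a "category" key.
    return [q for q in queries if isinstance(q, dict) and q.get("category") == category]
```

With this shape the caller always receives a list, so an empty `golden_queries.yaml` yields zero queries rather than an `AttributeError`.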
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 40a75545-352b-471c-b7bc-2cf62da88c66
📒 Files selected for processing (7)
agents/autogen/mcp_agent/tests/behavioral/conftest.py
agents/crewai/websearch_agent/tests/behavioral/conftest.py
agents/langgraph/agentic_rag/tests/behavioral/conftest.py
agents/langgraph/react_agent/tests/behavioral/test_tool_usage.py
agents/vanilla_python/openai_responses_agent/tests/behavioral/conftest.py
docs/adding-behavioral-tests.md
evals/harness/fixtures.py
b1f78f8 to b955476
The load_golden() helper was copy-pasted identically across 5 agent test files. Extract it into a shared module with an explicit fixtures_dir parameter; each consumer keeps a thin 2-line wrapper that preserves the existing zero-arg call signature, so no test files need changes.

Closes RHAIENG-5096

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
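The wrapper pattern described in the commit message might look roughly like the sketch below. To keep it self-contained, `_load_golden_from` is a stand-in with a made-up body; in the PR it would be imported with `from harness.fixtures import load_golden as _load_golden_from`. The `fixtures` directory name is likewise an assumption.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any


# Stand-in for the shared helper in evals/harness/fixtures.py.
# Hypothetical body: the real helper reads golden_queries.yaml from
# fixtures_dir; this one only echoes its arguments so the sketch runs alone.
def _load_golden_from(
    fixtures_dir: Path, category: str | None = None
) -> list[dict[str, Any]]:
    return [{"fixtures_dir": str(fixtures_dir), "category": category}]


# Assumed location; a real conftest would typically derive this from
# the test module's own directory.
FIXTURES_DIR = Path("fixtures")


def load_golden(category: str | None = None) -> list[dict[str, Any]]:
    """Thin wrapper: bind this agent's FIXTURES_DIR, keep the old signature."""
    return _load_golden_from(FIXTURES_DIR, category)
```

Because `load_golden()` still accepts zero arguments (or an optional category), existing tests can keep calling it unchanged while the file-reading logic lives in one place.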
b955476 to 3ed17bb
Summary
- Extract the `load_golden()` helper (load `golden_queries.yaml`, optionally filter by category) from 5 behavioral test files into a shared `evals/harness/fixtures.py` module
- Each consumer keeps a thin wrapper that binds `fixtures_dir` and preserves the existing zero-arg call signature; no test logic changes
- Update `docs/adding-behavioral-tests.md` with the new pattern

Files changed
- `evals/harness/fixtures.py`: `load_golden(fixtures_dir, category)`
- `agents/autogen/mcp_agent/tests/behavioral/conftest.py`
- `agents/crewai/websearch_agent/tests/behavioral/conftest.py`
- `agents/langgraph/agentic_rag/tests/behavioral/conftest.py`
- `agents/langgraph/react_agent/tests/behavioral/test_tool_usage.py`
- `agents/vanilla_python/openai_responses_agent/tests/behavioral/conftest.py`
- `docs/adding-behavioral-tests.md`

Test plan
- `from harness.fixtures import load_golden` resolves via `pythonpath = ["evals"]`
- `load_golden` is NOT re-exported from `harness.__init__` (consumers import from `harness.fixtures` directly)
- `pytest --collect-only` passes for all 5 affected agents (77 tests collected, 0 errors)
- No remaining `golden_queries.yaml` references in agent Python files (`grep` returns 0 matches)

🤖 Generated with Claude Code