RHAIENG-5096: extract shared load_golden() from duplicated conftest files by andrewdonheiser · Pull Request #107 · red-hat-data-services/agentic-starter-kits

andrewdonheiser · 2026-05-19T19:15:02Z

Summary

Extracts the duplicated load_golden() helper (load golden_queries.yaml, optionally filter by category) from 5 behavioral test files into a shared evals/harness/fixtures.py module
Each consumer keeps a thin 2-line wrapper that binds fixtures_dir and preserves the existing zero-arg call signature — no test logic changes
Updates docs/adding-behavioral-tests.md with the new pattern

Files changed

File	Change
`evals/harness/fixtures.py`	New — shared `load_golden(fixtures_dir, category)`
`agents/autogen/mcp_agent/tests/behavioral/conftest.py`	Use shared loader
`agents/crewai/websearch_agent/tests/behavioral/conftest.py`	Use shared loader
`agents/langgraph/agentic_rag/tests/behavioral/conftest.py`	Use shared loader
`agents/langgraph/react_agent/tests/behavioral/test_tool_usage.py`	Use shared loader
`agents/vanilla_python/openai_responses_agent/tests/behavioral/conftest.py`	Use shared loader
`docs/adding-behavioral-tests.md`	Document the new pattern

Test plan

Import resolution: from harness.fixtures import load_golden resolves via pythonpath = ["evals"]
load_golden is NOT re-exported from harness.__init__ (consumers import from harness.fixtures directly)
pytest --collect-only passes for all 5 affected agents (77 tests collected, 0 errors)
Cross-agent consistency check: 10/10 checks pass (import alias, FIXTURES_DIR naming, wrapper signature/body, no leftover inline implementations, no stale imports, doc accuracy)
No remaining golden_queries.yaml references in agent Python files (grep returns 0 matches)

🤖 Generated with Claude Code

coderabbitai · 2026-05-19T19:15:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: dbe4de0f-060a-4595-8157-7d37be318757

📥 Commits

Reviewing files that changed from the base of the PR and between a4767e8 and 3ed17bb.

📒 Files selected for processing (7)

agents/autogen/mcp_agent/tests/behavioral/conftest.py
agents/crewai/websearch_agent/tests/behavioral/conftest.py
agents/langgraph/agentic_rag/tests/behavioral/conftest.py
agents/langgraph/react_agent/tests/behavioral/test_tool_usage.py
agents/vanilla_python/openai_responses_agent/tests/behavioral/conftest.py
docs/adding-behavioral-tests.md
evals/harness/fixtures.py

✅ Files skipped from review due to trivial changes (1)

docs/adding-behavioral-tests.md

📝 Walkthrough

Walkthrough

Adds a shared golden-query loader (evals/harness/fixtures.py), updates multiple agent behavioral test modules to delegate their load_golden() to it (binding a local FIXTURES_DIR), and documents the wrapper pattern.

Changes

Golden Test Query Fixture Consolidation

Layer / File(s)	Summary
Shared fixture utility `evals/harness/fixtures.py`	New module exports `load_golden(fixtures_dir, category)` that reads `golden_queries.yaml`, extracts `queries`, and optionally filters by `category`.
Agent behavioral test migrations `agents/autogen/mcp_agent/tests/behavioral/conftest.py`, `agents/crewai/websearch_agent/tests/behavioral/conftest.py`, `agents/langgraph/agentic_rag/tests/behavioral/conftest.py`, `agents/langgraph/react_agent/tests/behavioral/test_tool_usage.py`, `agents/vanilla_python/openai_responses_agent/tests/behavioral/conftest.py`	These test modules import the shared loader (aliased locally), define `FIXTURES_DIR` pointing at their `fixtures/` folder, and replace inline YAML-reading/filtering with a thin wrapper that calls the shared loader with optional `category`.
Documentation update `docs/adding-behavioral-tests.md`	Adds a `load_golden()` wrapper example that binds `FIXTURES_DIR` and delegates to `harness.fixtures.load_golden`.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly describes the main objective: extracting a duplicated load_golden() helper from multiple conftest files into a shared module, which matches the changeset perfectly.
Description check	✅ Passed	The description accurately explains the refactoring effort, lists all affected files, details the test plan, and confirms consistency checks—directly related to and supporting the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch RHAIENG-5096-extract-shared-load-golden-utility-from-duplicated-conftest-files

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/adding-behavioral-tests.md`:
- Around line 43-50: The snippet uses Path and Any but doesn't import them; add
the missing imports for pathlib.Path and typing.Any at the top of the snippet so
the example is copy/paste-safe. Update the top of the snippet before the
existing from harness.fixtures import load_golden as _load_golden_from line to
import Path and Any, leaving FIXTURES_DIR, load_golden, and _load_golden_from
unchanged.

In `@evals/harness/fixtures.py`:
- Around line 2-5: Reorder and group the imports to satisfy Ruff I001: keep
"from __future__ import annotations" first, then the standard library imports in
alphabetical order ("from pathlib import Path" then "from typing import Any"),
then a blank line, then third-party imports ("import yaml"); ensure blank lines
separate the future, stdlib, and third-party groups and that the exact import
statements ("from __future__ import annotations", "from pathlib import Path",
"from typing import Any", "import yaml") are used as shown.
- Around line 14-15: yaml.safe_load(f) can return None or a non-mapping which
makes the subsequent data.get("queries", []) call raise; update the load logic
(the variable data from yaml.safe_load) to ensure it's a mapping before calling
get (e.g., if not isinstance(data, dict): set data = {} or directly set queries
= []), then derive queries from data.get("queries", []) so that queries is
always a list and does not blow up when the YAML root is empty/scalar.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 40a75545-352b-471c-b7bc-2cf62da88c66

📥 Commits

Reviewing files that changed from the base of the PR and between 7fb3ca4 and a4767e8.

📒 Files selected for processing (7)

agents/autogen/mcp_agent/tests/behavioral/conftest.py
agents/crewai/websearch_agent/tests/behavioral/conftest.py
agents/langgraph/agentic_rag/tests/behavioral/conftest.py
agents/langgraph/react_agent/tests/behavioral/test_tool_usage.py
agents/vanilla_python/openai_responses_agent/tests/behavioral/conftest.py
docs/adding-behavioral-tests.md
evals/harness/fixtures.py

The load_golden() helper was copy-pasted identically across 5 agent test files. Extract it into a shared module with an explicit fixtures_dir parameter; each consumer keeps a thin 2-line wrapper that preserves the existing zero-arg call signature, so no test files need changes. Closes RHAIENG-5096 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

andrewdonheiser requested a review from a team as a code owner May 19, 2026 19:15

andrewdonheiser requested review from sanafayyaz315 and tarun-etikala May 19, 2026 19:15

github-actions Bot added area/langgraph area/crewai area/autogen area/vanilla-python area/docs size/s labels May 19, 2026

andrewdonheiser force-pushed the RHAIENG-5096-extract-shared-load-golden-utility-from-duplicated-conftest-files branch from a4767e8 to 968453f Compare May 19, 2026 19:17

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

Comment thread docs/adding-behavioral-tests.md

Comment thread evals/harness/fixtures.py

Comment thread evals/harness/fixtures.py Outdated

andrewdonheiser force-pushed the RHAIENG-5096-extract-shared-load-golden-utility-from-duplicated-conftest-files branch 2 times, most recently from b1f78f8 to b955476 Compare May 19, 2026 19:21

tarun-etikala reviewed May 20, 2026

View reviewed changes

Comment thread evals/harness/fixtures.py

tarun-etikala previously approved these changes May 20, 2026

View reviewed changes

andrewdonheiser dismissed tarun-etikala’s stale review via 3ed17bb May 20, 2026 13:50

andrewdonheiser force-pushed the RHAIENG-5096-extract-shared-load-golden-utility-from-duplicated-conftest-files branch from b955476 to 3ed17bb Compare May 20, 2026 13:50

tarun-etikala approved these changes May 20, 2026

View reviewed changes

andrewdonheiser merged commit 33cc35f into main May 20, 2026
8 checks passed

andrewdonheiser deleted the RHAIENG-5096-extract-shared-load-golden-utility-from-duplicated-conftest-files branch May 20, 2026 13:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RHAIENG-5096: extract shared load_golden() from duplicated conftest files#107

RHAIENG-5096: extract shared load_golden() from duplicated conftest files#107
andrewdonheiser merged 1 commit into
mainfrom
RHAIENG-5096-extract-shared-load-golden-utility-from-duplicated-conftest-files

andrewdonheiser commented May 19, 2026

Uh oh!

coderabbitai Bot commented May 19, 2026 •

edited

Loading

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andrewdonheiser commented May 19, 2026

Summary

Files changed

Test plan

Uh oh!

coderabbitai Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented May 19, 2026 •

edited

Loading