feat: Add Agent QA Pipeline template #6053

Open
Samir-atra wants to merge 1 commit into aden-hive:main from Samir-atra:feat/agent-qa-pipeline-4286

Conversation

@Samir-atra

Description

This PR implements the Agent QA Pipeline template as proposed in Issue #4286 - a goal-driven agent that performs quality assessment on other Hive agents.

What it does

The Agent QA Pipeline performs comprehensive quality assessment on other Hive agents through:

  • Static Analysis: Topology, patterns, edge consistency, node quality
  • Functional Testing: Spec-level correctness validation
  • Resilience Testing: Error handling and recovery patterns
  • Security Auditing: OWASP LLM Top 10 checks
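As a rough illustration of the kind of topology check the static-analysis stage could run, here is a minimal sketch that finds nodes unreachable from the entry node. The function name `unreachable_nodes` and the sample graph are hypothetical and are not taken from this PR's code.

```python
from collections import deque


def unreachable_nodes(nodes, edges, entry):
    """Return the set of nodes that cannot be reached from `entry`."""
    adjacency = {node: [] for node in nodes}
    for src, dst in edges:
        adjacency[src].append(dst)

    # Breadth-first traversal starting at the entry node.
    seen = {entry}
    queue = deque([entry])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return set(nodes) - seen


# Hypothetical three-node agent with one orphaned node.
sample_nodes = ["intake", "process", "orphan"]
sample_edges = [("intake", "process")]
print(unreachable_nodes(sample_nodes, sample_edges, "intake"))  # {'orphan'}
```

A real check would presumably read the node and edge lists from the target agent's spec rather than from literals.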

The pipeline produces a PASS / CONDITIONAL / FAIL verdict with a score (0-100) and letter grade (A-F), along with actionable fix suggestions and iterative re-test cycles.
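The description gives the score range and the grade range but not the cutoffs, so the following is only a sketch of how a score might map to a grade and verdict. Every threshold below, and the names `Verdict` and `judge`, are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed cutoffs: the PR states the ranges (0-100, A-F) but not the thresholds.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
PASS_THRESHOLD = 80         # assumption
CONDITIONAL_THRESHOLD = 60  # assumption


@dataclass(frozen=True)
class Verdict:
    score: int
    grade: str
    verdict: str


def judge(score: int) -> Verdict:
    """Map a 0-100 score to a letter grade and a PASS/CONDITIONAL/FAIL verdict."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in 0-100, got {score}")
    # First cutoff the score meets wins; anything below the last cutoff is an F.
    grade = next((g for cutoff, g in GRADE_CUTOFFS if score >= cutoff), "F")
    if score >= PASS_THRESHOLD:
        verdict = "PASS"
    elif score >= CONDITIONAL_THRESHOLD:
        verdict = "CONDITIONAL"
    else:
        verdict = "FAIL"
    return Verdict(score, grade, verdict)


print(judge(85))
```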

Graph Architecture

intake → load-agent → static-analysis → generate-test-plan → review-test-plan (HITL pause)
  → [fan-out: run-functional | run-resilience | run-security]
  → [fan-in: aggregate-results]
  → judge-quality
    → PASS/FAIL → generate-report → deliver-report
    → CONDITIONAL → request-fixes → load-agent (feedback cycle, max 3x)
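The diagram can be written out as a plain edge list for inspection. This is a sketch inferred from the diagram and from the on_failure edge listed under Framework Features Demonstrated; the condition labels are hypothetical and the actual definitions in agent.py may differ. Counting PASS and FAIL as separate edges gives 13 nodes and 17 edges, which matches the counts stated under Changes Made.

```python
# (source, target, condition) triples; condition labels are illustrative only.
EDGES = [
    ("intake", "load-agent", "always"),
    ("load-agent", "static-analysis", "on_success"),
    ("load-agent", "generate-report", "on_failure"),
    ("static-analysis", "generate-test-plan", "always"),
    ("generate-test-plan", "review-test-plan", "always"),
    # Fan-out to the three test runners.
    ("review-test-plan", "run-functional", "always"),
    ("review-test-plan", "run-resilience", "always"),
    ("review-test-plan", "run-security", "always"),
    # Fan-in to the aggregator.
    ("run-functional", "aggregate-results", "always"),
    ("run-resilience", "aggregate-results", "always"),
    ("run-security", "aggregate-results", "always"),
    ("aggregate-results", "judge-quality", "always"),
    ("judge-quality", "generate-report", "PASS"),
    ("judge-quality", "generate-report", "FAIL"),
    ("judge-quality", "request-fixes", "CONDITIONAL"),
    ("request-fixes", "load-agent", "always"),  # feedback cycle
    ("generate-report", "deliver-report", "always"),
]

NODES = sorted({name for src, dst, _ in EDGES for name in (src, dst)})


def successors(node):
    """Return the distinct targets reachable from `node` in one step."""
    return sorted({dst for src, dst, _ in EDGES if src == node})


print(len(NODES), len(EDGES))
print(successors("review-test-plan"))
```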

Framework Features Demonstrated

This template is the first to demonstrate these features (all at 0% template coverage before this PR):

| Feature | Implementation | Notes |
| --- | --- | --- |
| Fan-out / fan-in | 3 parallel test runners → aggregate-results | Demonstrates parallel execution |
| on_failure edge | load-agent → generate-report | Graceful error handling |
| HITL pause_nodes | review-test-plan | User approves test plan |
| Conditional routing | judge-quality → PASS/FAIL vs CONDITIONAL | Multi-path routing |
| Feedback loop with max_node_visits | request-fixes → load-agent (max 3) | Prevents infinite loops |
| nullable_output_keys | load_errors, test_preferences, fix_suggestions | Optional outputs |
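Two of the patterns above, fan-out / fan-in and the visit-capped feedback loop, can be sketched in plain Python without the framework. This is not the Hive API: `run_suite`, `fan_out_fan_in`, `feedback_loop`, and `MAX_VISITS` are hypothetical stand-ins, and how the framework actually enforces max_node_visits may differ.

```python
import asyncio

MAX_VISITS = 3  # mirrors the "max 3" cap from the feature list


async def run_suite(name: str) -> dict:
    """Stand-in for one test runner; a real runner would call an LLM or tools."""
    await asyncio.sleep(0)
    return {"suite": name, "passed": True}


async def fan_out_fan_in() -> list:
    """Run the three suites concurrently and collect results in launch order."""
    suites = ["run-functional", "run-resilience", "run-security"]
    return await asyncio.gather(*(run_suite(s) for s in suites))


def feedback_loop(verdicts):
    """Consume judge verdicts until a terminal verdict or the visit cap is hit.

    Returns the last verdict seen and the number of cycles used.
    """
    visits = 0
    for verdict in verdicts:
        visits += 1
        if verdict != "CONDITIONAL" or visits >= MAX_VISITS:
            return verdict, visits
    return "CONDITIONAL", visits


results = asyncio.run(fan_out_fan_in())
print([r["suite"] for r in results])
print(feedback_loop(["CONDITIONAL", "PASS"]))
```

The cap matters because a judge that keeps returning CONDITIONAL would otherwise cycle through request-fixes and load-agent indefinitely.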

Type of Change

  • New feature (non-breaking change that adds functionality)

Related Issues

Resolves #4286

Changes Made

  • examples/templates/agent_qa_pipeline/agent.py: Main agent class with 13 nodes, 17 edges
  • examples/templates/agent_qa_pipeline/agent.json: Declarative agent spec
  • examples/templates/agent_qa_pipeline/config.py: Configuration class
  • examples/templates/agent_qa_pipeline/__init__.py: Module exports
  • examples/templates/agent_qa_pipeline/main.py: Entry point
  • examples/templates/agent_qa_pipeline/nodes/__init__.py: All 13 node definitions
  • examples/templates/agent_qa_pipeline/tests/test_structure.py: 27 comprehensive tests
  • examples/templates/agent_qa_pipeline/tests/conftest.py: Pytest configuration

Testing

All 27 tests pass:

cd core && uv run pytest ../examples/templates/agent_qa_pipeline/tests/ -v
# Output: 27 passed in 4.31s
  • Unit tests pass
  • Graph validation passes (0 errors, 0 warnings)
  • Manual testing performed

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • My changes generate no new warnings
  • I have added tests that prove my feature works
  • New and existing unit tests pass locally with my changes

Implementation Notes

Phase 1 (Works Today)

  • Static analysis + spec-level LLM reasoning for all 3 test categories
  • All graph patterns work with current framework
  • Useful as-is for structural validation

Future Enhancements (Framework Proposals)

  • Proposal 1: Sub-Graph Execution Node - for actual runtime testing
  • Proposal 2: Tool Interception/Mocking - for resilience testing
  • Proposal 3: Execution Snapshot & Comparison - for regression testing
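To make Proposal 2 more concrete, here is a minimal sketch of what tool interception for resilience testing could look like: a wrapper injects a failure on chosen calls, and the test checks that the caller recovers. Nothing here exists in the framework today; `intercept`, `InjectedToolFailure`, and `call_with_retry` are hypothetical names.

```python
class InjectedToolFailure(RuntimeError):
    """Raised by the interceptor to simulate a failing tool call."""


def intercept(tool, fail_on_calls):
    """Wrap `tool` so that the call numbers in `fail_on_calls` (1-based) fail."""
    calls = 0

    def wrapper(*args, **kwargs):
        nonlocal calls
        calls += 1
        if calls in fail_on_calls:
            raise InjectedToolFailure(f"injected failure on call {calls}")
        return tool(*args, **kwargs)

    return wrapper


def call_with_retry(tool, *args, retries=2):
    """Stand-in for agent-side recovery: retry when the tool call fails."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return tool(*args)
        except InjectedToolFailure as error:
            last_error = error
    raise last_error


# The first call fails, the retry succeeds.
flaky_upper = intercept(str.upper, fail_on_calls={1})
print(call_with_retry(flaky_upper, "ping"))  # PING
```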

Each phase is independently valuable and can be implemented incrementally.

This commit adds the Agent QA Pipeline template, which performs
quality assessment on other Hive agents through:
- Static analysis (topology, patterns, edge consistency)
- Functional testing (spec-level reasoning)
- Resilience testing (error handling patterns)
- Security auditing (OWASP LLM Top 10)

The agent demonstrates the following framework features
(all at 0% template coverage before this PR):

- Fan-out/fan-in pattern: 3 parallel test runners
- ON_FAILURE edge: load-agent -> generate-report
- HITL pause_nodes: review-test-plan
- Conditional routing: judge -> PASS/FAIL vs CONDITIONAL
- Feedback loop with max_node_visits: request-fixes -> load-agent (max 3)
- nullable_output_keys: load_errors, test_preferences, fix_suggestions

Phase 1 (works today):
- Static analysis + spec-level LLM reasoning for all 3 test categories
- Demonstrates all graph patterns
- Useful as-is for structural validation

Phase 2 (needs Proposal 1 - sub-graph execution)
- Runtime testing of target agents

Phase 3 (needs Proposals 2+3)
- Tool interception for resilience testing
- Snapshot comparison for regression testing

Files:
- examples/templates/agent_qa_pipeline/agent.py - Main agent class
- examples/templates/agent_qa_pipeline/agent.json - Declarative agent spec
- examples/templates/agent_qa_pipeline/config.py - Configuration
- examples/templates/agent_qa_pipeline/nodes/__init__.py - Node definitions
- examples/templates/agent_qa_pipeline/tests/test_structure.py - Structure tests

Resolves aden-hive#4286
@github-actions

github-actions bot commented Mar 9, 2026

PR Requirements Warning

This PR does not meet the contribution requirements.
If the issue is not fixed within ~24 hours, it may be automatically closed.

PR Author: @Samir-atra
Found issues: #4286 (assignees: none)
Problem: The PR author must be assigned to the linked issue.

To fix:

  1. Assign yourself (@Samir-atra) to one of the linked issues
  2. Re-open this PR

Exception: To bypass this requirement, you can:

  • Add the micro-fix label or include micro-fix in your PR title for trivial fixes
  • Add the documentation label or include doc/docs in your PR title for documentation changes

Micro-fix requirements (must meet ALL):

| Qualifies | Disqualifies |
| --- | --- |
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |

Why is this required? See #472 for details.

@github-actions github-actions bot added the pr-requirements-warning label Mar 9, 2026

Labels

pr-requirements-warning PR doesn't follow contribution guidelines. Please fix or it will be auto-closed.

Development

Successfully merging this pull request may close these issues.

[agent-idea] Agent QA Pipeline — Meta-Circular Testing with Framework Evolution Proposals

1 participant