refactor: provider-agnostic LLMJudge with auto-detection for OpenAI (#1103) #1113

Merged
TimothyZhang7 merged 2 commits into aden-hive:main from TanujaNair03:refactor/llm-judge-agnostic
Jan 27, 2026

Conversation

@TanujaNair03
Contributor

Description

This PR refactors the LLMJudge to be truly provider-agnostic by implementing automatic environment-based detection. While #477 added the ability to manually inject a provider, the judge still defaulted to a hardcoded Anthropic implementation that would fail for users without an Anthropic key or the anthropic package. This PR introduces a "Smart Default" system that prioritizes OPENAI_API_KEY when available, creating a seamless experience for non-Anthropic users without requiring manual code changes to existing tests.
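The detection order described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `detect_provider` and `_PROVIDER_ENV_VARS` are hypothetical names, and only the two environment variable names and their priority come from the description.

```python
import os
from typing import Mapping, Optional

# Priority order taken from the PR description: OpenAI first, then Anthropic.
# The table name and the provider labels are assumptions for illustration.
_PROVIDER_ENV_VARS = (
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
)


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose API key is set and non-blank, else None."""
    # Accept an explicit mapping so the function is testable without
    # mutating the real process environment.
    env = os.environ if env is None else env
    for name, var in _PROVIDER_ENV_VARS:
        if env.get(var, "").strip():
            return name
    return None
```

Passing the environment in as a mapping keeps the check side-effect free; a caller that gets `None` back can decide whether to raise or to require an explicitly injected provider.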

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

Related Issues

Fixes #1103

Changes Made

  • Automatic Provider Detection: Implemented _get_fallback_provider() to automatically detect and prioritize OPENAI_API_KEY followed by ANTHROPIC_API_KEY.
  • Backward Compatibility: Kept the legacy _get_client() method and added logic to detect mocked clients, so existing unit tests pass unchanged.
  • Robust JSON Parsing: Added a centralized _parse_json_result helper that extracts the JSON payload even when a provider wraps it in a markdown code block or surrounds it with extra "chatter".
  • Dependency Optimization: Lazy-loaded the anthropic client so the package is no longer a strict requirement for users running only OpenAI models.
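
As an illustration of the JSON parsing change, a helper along these lines would tolerate fenced output and surrounding chatter. It is a sketch under assumptions, not the `_parse_json_result` implementation from this PR: the function name `parse_json_result` and the `passes`/`explanation` keys in the usage note are hypothetical.

```python
import json
import re
from typing import Any, Dict

# Build the fence marker programmatically so this snippet can itself live
# inside a markdown code block.
_FENCE = "`" * 3
_FENCE_RE = re.compile(
    _FENCE + r"(?:json)?\s*(.*?)" + _FENCE,
    re.DOTALL | re.IGNORECASE,
)


def parse_json_result(text: str) -> Dict[str, Any]:
    """Extract the first JSON object from an LLM reply (hypothetical helper)."""
    # Prefer the contents of a fenced block if the model emitted one.
    match = _FENCE_RE.search(text)
    candidate = match.group(1) if match else text

    start = candidate.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")

    # raw_decode parses a single JSON value and ignores whatever follows it,
    # which discards trailing chatter. Malformed JSON raises
    # json.JSONDecodeError, a subclass of ValueError.
    obj, _end = json.JSONDecoder().raw_decode(candidate[start:])
    return obj
```

For example, a reply such as `Verdict: {"passes": false} thanks` would yield `{"passes": False}`, while a reply with no `{` raises `ValueError`.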

Testing

  • Unit tests pass (cd core && pytest tests/test_llm_judge.py)
  • Lint passes (cd core && ruff check framework/testing/llm_judge.py)
  • Manual testing performed (Verified on Python 3.14.2 with OpenAI keys)

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@github-actions

PR Closed - Requirements Not Met

This PR has been automatically closed because it doesn't meet the requirements.

PR Author: @TanujaNair03
Found issues: #1103 (assignees: none), #477 (assignees: pradyten)
Problem: The PR author must be assigned to the linked issue.

To fix:

  1. Assign yourself (@TanujaNair03) to one of the linked issues
  2. Re-open this PR

Why is this required? See #472 for details.

@github-actions github-actions bot closed this Jan 27, 2026
@TimothyZhang7 TimothyZhang7 reopened this Jan 27, 2026
Collaborator

@TimothyZhang7 TimothyZhang7 left a comment

lgtm

@TimothyZhang7 TimothyZhang7 merged commit e1bea18 into aden-hive:main Jan 27, 2026
6 of 8 checks passed
jhalak999 pushed a commit to jhalak999/hive that referenced this pull request Feb 17, 2026
…e-agnostic

refactor: provider-agnostic LLMJudge with auto-detection for OpenAI (aden-hive#1103)

Development

Successfully merging this pull request may close these issues.

[Feature]: LLM Judge is hardcoded to Anthropic and lacks provider-agnostic defaulting

2 participants