feat(metrics): add DomainComplianceMetric for regulated industry LLM evaluation by sanianayab · Pull Request #2638 · confident-ai/deepeval

sanianayab · 2026-04-29T22:01:35Z

Overview

This PR introduces DomainComplianceMetric, a new custom metric for evaluating LLM outputs in regulated industry domains: banking, healthcare, telco, and manufacturing.

Motivation

DeepEval's existing metrics (faithfulness, answer relevancy, hallucination) are domain-agnostic. This works well for general LLM apps, but regulated industry deployments face specific failure modes that generic metrics miss:

A banking LLM that confidently states a wrong interest rate
A healthcare LLM that prescribes a dosage not in the retrieved context
A telco LLM that guarantees 99.99% uptime based on nothing

Standard faithfulness checks may not catch these because the output sounds plausible. Domain-specific evaluation criteria (compliance hedging, no absolute guarantees, regulatory alignment) are needed.

The evaluation steps enforce constraint-based binary judgments per compliance dimension, reducing LLM-as-judge stochasticity.
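One way to read "binary judgments per compliance dimension" is a pass/fail verdict per dimension, averaged into a single score. The sketch below is illustrative only: the dimension names and the averaging rule are assumptions, not taken from this PR's implementation, which delegates scoring to GEval.

```python
from typing import Dict


def aggregate_binary_judgments(judgments: Dict[str, bool]) -> float:
    """Return the fraction of compliance dimensions judged as passing."""
    if not judgments:
        raise ValueError("at least one compliance dimension is required")
    return sum(judgments.values()) / len(judgments)


# Hypothetical verdicts for a single banking answer.
banking_judgments = {
    "grounded_in_context": True,
    "no_absolute_guarantees": True,
    "includes_hedging": False,
}

score = aggregate_binary_judgments(banking_judgments)
passed = score >= 0.7  # threshold value borrowed from the Usage section
```

Because each dimension can only be true or false, the judge has less room to drift between runs than it would with a free-form numeric rating.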

Changes

deepeval/metrics/domain_compliance/
    __init__.py                  # exports DomainComplianceMetric
    domain_compliance.py         # metric implementation

tests/
    test_domain_compliance.py    # pytest unit tests (banking + healthcare)

examples/
    domain_compliance_example.py # runnable usage example

How It Works

DomainComplianceMetric inherits from BaseMetric and wraps a domain-specific GEval instance with:

Per-domain evaluation criteria (regulatory accuracy, hedging, no guarantees)
Per-domain evaluation steps (constraint-based, grounded in compliance requirements)
Mandatory context enforcement (raises ValueError if context is missing, since domain evaluation without context is meaningless)
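The domain lookup and the context check listed above can be sketched in plain Python. This is a minimal, dependency-free illustration, not the code in this PR: `SimpleTestCase` and `validate` are hypothetical stand-ins, the criteria strings are invented, and raising ValueError for an unknown domain is an assumption (the description only confirms ValueError for missing context).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical, abbreviated criteria; the PR's actual strings are not shown here.
DOMAIN_CRITERIA: Dict[str, str] = {
    "banking": "No invented rates or fees; no guaranteed returns.",
    "healthcare": "No dosages or diagnoses absent from the context.",
}


@dataclass
class SimpleTestCase:
    """Stand-in for deepeval's LLMTestCase so the sketch has no dependencies."""
    input: str
    actual_output: str
    context: Optional[List[str]] = None


def validate(domain: str, test_case: SimpleTestCase) -> str:
    """Return the criteria for `domain`, or raise ValueError on bad input."""
    if domain not in DOMAIN_CRITERIA:
        supported = ", ".join(sorted(DOMAIN_CRITERIA))
        raise ValueError(f"Unsupported domain {domain!r}; expected one of: {supported}")
    if not test_case.context:  # rejects both None and an empty list
        raise ValueError("context is required for domain compliance evaluation")
    return DOMAIN_CRITERIA[domain]
```

In the real metric, the returned criteria would presumably be handed to the wrapped GEval instance only after both checks pass.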

Usage

from deepeval.metrics.domain_compliance import DomainComplianceMetric
from deepeval.test_case import LLMTestCase

# Select the domain rule set; threshold is the minimum passing score.
metric = DomainComplianceMetric(domain="banking", threshold=0.7)
test_case = LLMTestCase(
    input="What is the early repayment fee?",
    actual_output="There is a 2% fee. Consult your advisor for full details.",
    # context is mandatory: the metric raises ValueError without it.
    context=["Loan agreement: 2% early repayment charge applies."],
)
metric.measure(test_case)
print(metric.score, metric.reason)

Supported Domains

Domain	Key checks
`banking`	Hallucinated rates/fees, AML/PSD2 alignment, no return guarantees
`healthcare`	Hallucinated dosages/diagnoses, HIPAA alignment, professional referral
`telco`	Fabricated SLAs/uptime, net neutrality alignment
`manufacturing`	Fabricated sensor readings, safety-critical flagging, ISO alignment

Testing

deepeval test run tests/test_domain_compliance.py

Covers: compliant outputs (pass), non-compliant outputs (fail), missing context error, invalid domain error, async execution.

Notes

Fully compatible with DeepEval's CI/CD integration and Confident AI logging
Provider-agnostic: works with any model supported by DeepEval
Designed to be extended: additional domains can be added by appending to DOMAIN_CRITERIA and DOMAIN_EVALUATION_STEPS
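As a rough sketch of that extension point, assuming both registries are dictionaries keyed by domain name (their actual types are not shown in this description). The `register_domain` helper and the "insurance" entry are hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Stand-ins for the module-level registries named above; contents are invented.
DOMAIN_CRITERIA: Dict[str, str] = {
    "banking": "No invented rates or fees; no guaranteed returns.",
}
DOMAIN_EVALUATION_STEPS: Dict[str, List[str]] = {
    "banking": ["Check every rate or fee against the context."],
}


def register_domain(name: str, criteria: str, steps: List[str]) -> None:
    """Add a new domain to both registries, keeping them in sync."""
    if name in DOMAIN_CRITERIA or name in DOMAIN_EVALUATION_STEPS:
        raise ValueError(f"domain {name!r} is already registered")
    if not steps:
        raise ValueError("at least one evaluation step is required")
    DOMAIN_CRITERIA[name] = criteria
    DOMAIN_EVALUATION_STEPS[name] = list(steps)  # copy to avoid aliasing


register_domain(
    "insurance",
    "No invented coverage limits; no guaranteed claim outcomes.",
    [
        "Check every coverage figure against the context.",
        "Fail the output if it promises a claim will be paid.",
    ],
)
```

Updating both registries through one function avoids a domain that has criteria but no evaluation steps, or the reverse.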

vercel · 2026-04-29T22:01:40Z

@sanianayab is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

sanianayab added 2 commits April 29, 2026 23:34

feat: implement DomainComplianceMetric with multi-domain G-Eval criteria

c21c4a4

docs: add runnable domain compliance example

6d26043

sanianayab marked this pull request as ready for review April 29, 2026 22:13

sanianayab closed this Apr 29, 2026

sanianayab reopened this Apr 29, 2026

sanianayab marked this pull request as draft April 29, 2026 22:24

sanianayab marked this pull request as ready for review April 29, 2026 22:26

sanianayab marked this pull request as draft April 29, 2026 22:27

sanianayab marked this pull request as ready for review April 30, 2026 12:26

sanianayab closed this Apr 30, 2026

sanianayab reopened this Apr 30, 2026

sanianayab closed this Apr 30, 2026

sanianayab reopened this Apr 30, 2026

sanianayab wants to merge 2 commits into confident-ai:main from sanianayab:feature/domain-compliance-metric
