Trust scoring as an evaluation metric — source tier and provenance

## Context

DeepEval evaluates LLM outputs on faithfulness, relevance, hallucination, and similar quality dimensions. One related dimension is missing: **trust scoring**, i.e. how trustworthy an output is given the sources it drew on.

Two responses can score identically on faithfulness but have very different trust profiles:
- Response A sourced from SEC filings (high trust)
- Response B sourced from unverified blog posts (low trust)

## Possible metric

A `TrustScoreMetric` that evaluates:
- **Source tier** — were the retrieval sources authoritative (T1-T2) or unverified (T4-T5)?
- **Provenance completeness** — does the output carry metadata about its origin?
- **Verification status** — was the output human-reviewed?

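```python
from deepeval.test_case import LLMTestCase
# Proposed API: TrustScoreMetric does not exist in DeepEval yet.
from deepeval.metrics import TrustScoreMetric

metric = TrustScoreMetric(
    threshold=0.7,
    # Maps source categories to tiers (1 = most authoritative, 5 = least).
    source_tiers={"SEC filings": 1, "news": 3, "forums": 4, "AI inference": 5},
)
test_case = LLMTestCase(
    input="What was Q3 revenue?",
    actual_output="Revenue was $4.2B",
    retrieval_context=["SEC 10-Q filing: Revenue $4.2B"],
)
metric.measure(test_case)
# Illustrative result: trust_score 0.95 (T1 source)
```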
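
For discussion, here is a minimal sketch of how the three components above could combine into a single score. It is plain Python with no DeepEval dependency. The weights, the linear tier mapping, the helper names (`tier_to_score`, `trust_score`), and the choice to treat unknown sources as worst-tier are all my assumptions, not something AKF or DeepEval defines.

```python
# Hypothetical weights: the proposal does not specify how components combine.
WEIGHTS = {"source_tier": 0.6, "provenance": 0.25, "verification": 0.15}
WORST_TIER = 5  # assumes tiers run from T1 (best) to T5 (worst)


def tier_to_score(tier: int) -> float:
    """Map a tier linearly onto [0, 1]: T1 -> 1.0, T5 -> 0.0 (assumed mapping)."""
    if not 1 <= tier <= WORST_TIER:
        raise ValueError(f"tier must be in 1..{WORST_TIER}, got {tier}")
    return (WORST_TIER - tier) / (WORST_TIER - 1)


def trust_score(
    source_labels: list[str],
    source_tiers: dict[str, int],
    provenance_completeness: float,
    human_reviewed: bool,
) -> float:
    """Weighted sum of mean source-tier score, provenance, and verification."""
    if not 0.0 <= provenance_completeness <= 1.0:
        raise ValueError("provenance_completeness must be in [0, 1]")
    if not source_labels:
        return 0.0  # assumption: an output with no known sources earns no trust
    # Assumption: labels missing from the mapping count as worst-tier.
    tier_part = sum(
        tier_to_score(source_tiers.get(label, WORST_TIER)) for label in source_labels
    ) / len(source_labels)
    return (
        WEIGHTS["source_tier"] * tier_part
        + WEIGHTS["provenance"] * provenance_completeness
        + WEIGHTS["verification"] * (1.0 if human_reviewed else 0.0)
    )


tiers = {"SEC filings": 1, "news": 3, "forums": 4, "AI inference": 5}

# Response A: SEC filing with full provenance metadata, not human-reviewed.
score_a = trust_score(["SEC filings"], tiers, provenance_completeness=1.0, human_reviewed=False)
# Response B: forum post with no provenance metadata, not human-reviewed.
score_b = trust_score(["forums"], tiers, provenance_completeness=0.0, human_reviewed=False)

print(f"A: {score_a:.2f}  B: {score_b:.2f}")  # A: 0.85  B: 0.15
```

With `threshold=0.7`, Response A would pass and Response B would fail under these assumed weights, even if both scored the same on faithfulness. A real implementation would presumably wrap this logic in a DeepEval custom metric and take its tier definitions from AKF.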

## Why this matters

1. EU AI Act Article 50 (transparency obligations, applicable from August 2, 2026): source and provenance signals could help document the transparency that compliance calls for
2. Enterprise RAG systems need to differentiate high-trust vs low-trust outputs
3. Extends DeepEval's coverage from quality metrics to trust metrics

## Reference

- [AKF](https://akf.dev) defines source tiers (T1-T5) and trust computation

Would the team consider trust scoring as an evaluation dimension?
