474 security tests across 33 modules (verified by `scripts/count_tests.py`).

Scenarios are mapped across the STRIDE threat model:
| Category | Tests | Examples |
|---|---|---|
| Spoofing | 4 | Rogue agent registration, MCP replay attack, credential velocity check |
| Tampering | 15 | Prompt injection, SCADA sensor poisoning, polymorphic attacks, normalization of deviance, supply chain poisoning, code gen execution, non-deterministic exploitation |
| Information Disclosure | 1 | Unauthorized financial data access |
| Denial of Service | 2 | Orchestration flood, A2A recursion loop |
| Elevation of Privilege | 3 | Unauthorized A2A escalation, tool overreach, safety override |
| InfraGard-Derived | 7 | Superman effect, polymorphic evasion, LLM hallucination injection, data poisoning, deviance drift |

The framework maps test scenarios to all 10 categories of the OWASP Agentic Top 10:
| OWASP Agentic ID | Risk | Test Scenarios |
|---|---|---|
| ASI01 | Agent Goal Hijack | RT-003 (SAP prompt injection), RT-018 (social engineering), RT-022 (hallucination injection) |
| ASI02 | Tool Misuse & Exploitation | RT-006 (tool overreach), RT-017 (SCADA shutdown suggestion) |
| ASI03 | Identity & Privilege Abuse | RT-002 (A2A escalation), RT-025 (superman effect), RT-001 (rogue registration) |
| ASI04 | Agentic Supply Chain Vulns | RT-014 (rogue orchestration join), RT-026 (MCP server supply chain poisoning) |
| ASI05 | Unexpected Code Execution | RT-004 (SCADA sensor injection), RT-027 (agent code generation execution) |
| ASI06 | Memory & Context Poisoning | RT-005 (cascade corruption), RT-009 (long-context), RT-023 (data poisoning) |
| ASI07 | Insecure Inter-Agent Comms | RT-020 (MCP replay), RT-012 (A2A recursion loop) |
| ASI08 | Cascading Failures | RT-005 (multi-agent cascade), RT-024 (normalization of deviance) |
| ASI09 | Human-Agent Trust Exploitation | RT-018 (social engineering), RT-019 (priority inflation) |
| ASI10 | Non-Deterministic Behavior | RT-028 (non-deterministic output exploitation) |

    agent-security test mcp --url http://localhost:8080/mcp

| Test ID | Test | OWASP ASI | Description |
|---|---|---|---|
| MCP-001 | Tool List Integrity Check | ASI04 | Validates tool discovery response integrity |
| MCP-002 | Tool Registration via Call Injection | ASI04 | Attempts malicious tool registration |
| MCP-003 | Capability Escalation via Initialize | ASI03 | Tests capability negotiation boundaries |
| MCP-004 | Protocol Version Downgrade Attack | ASI03 | Attempts protocol version manipulation |
| MCP-005 | Resource URI Path Traversal | ASI04 | Tests resource access path validation |
| MCP-006 | Prompt Template Injection via Get | ASI01 | Injects malicious prompt templates |
| MCP-007 | Sampling Request Context Exfiltration | ASI02 | Attempts context data extraction |
| MCP-008 | Malformed JSON-RPC Handling | ASI08 | Tests protocol error handling |
| MCP-009 | Batch Request DoS | ASI08 | Batch request flood testing |
| MCP-010 | Tool Call Argument Injection | ASI02 | Malicious tool parameter injection |
| MCP-011 | Tool Description Context Displacement | ASI08 | 50K+ char description DoS with hidden injection payload |
| MCP-012 | Tool Description Oversized Check | ASI08 | Detects tool descriptions exceeding 10KB threshold for context displacement |
| MCP-013 | Tool Description Padding / Repetition Detection | ASI08 | Detects repeated phrases, whitespace padding, and low-entropy descriptions |
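
The size, padding, and entropy checks behind MCP-012 and MCP-013 can be pictured with a minimal Python sketch. It is not the harness's implementation: `check_description`, the entropy floor of 3.0 bits per character, the 0.5 whitespace ratio, and the 64-character minimum are illustrative assumptions. Only the 10KB threshold comes from the table above, and repeated-phrase detection is left out.

```python
import math
from collections import Counter

# 10KB threshold named in MCP-012; every other constant below is an assumption.
MAX_DESCRIPTION_BYTES = 10 * 1024


def shannon_entropy(text: str) -> float:
    """Return the empirical character entropy of text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def check_description(
    description: str,
    min_entropy: float = 3.0,           # hypothetical floor for "low entropy"
    max_whitespace_ratio: float = 0.5,  # hypothetical padding threshold
) -> list:
    """Return a list of finding labels for one tool description."""
    findings = []
    if len(description.encode("utf-8")) > MAX_DESCRIPTION_BYTES:
        findings.append("oversized")
    if description:
        whitespace = sum(ch.isspace() for ch in description) / len(description)
        if whitespace > max_whitespace_ratio:
            findings.append("whitespace_padding")
        # Skip the entropy check for very short strings, where it is noisy.
        if len(description) >= 64 and shannon_entropy(description) < min_entropy:
            findings.append("low_entropy")
    return findings
```

An ordinary one-sentence description produces no findings, while a 20,000-character run of a single letter is reported as both oversized and low-entropy.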

    agent-security test mcp-supplychain --command "npx -y some-mcp-server" --project-root .

Runs static pre-flight checks on the binary-resolution and package-install path that executes before the first JSON-RPC byte is sent (issue #206). These checks complement the protocol suite above; they do not replace it.

| Test ID | Test | OWASP ASI | Description |
|---|---|---|---|
| MCP-F-001 | Launcher Binary Resolution | ASI06 | Resolves the binary that actually executes (walks $PATH + node_modules/.bin + .venv/bin); flags shadowing by world-writable / project-local shims |
| MCP-F-002 | Install-Script Inspection | ASI06 | Flags network-callable / filesystem-mutating npm preinstall/install/postinstall scripts before they run |
| MCP-F-003 | Dependency-Confusion Resolvability | ASI06 | Checks public-registry resolvability of internal-looking package names (network-gated: --allow-network) |
| MCP-F-004 | Launcher Version Pinning | ASI06 | Flags unpinned launchers (npx -y pkg / uvx pkg without @version/==version/SHA) |
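
The pinning rule in MCP-F-004 can be sketched in Python as follows. This is an assumption-laden illustration, not the harness's code: `is_unpinned` and the `PINNED` pattern are hypothetical names, the sketch treats the first non-flag argument as the package spec (so flags that take a value are not handled), and SHA pins are not recognized.

```python
import os
import re
import shlex

# Launchers covered by this sketch (the table above names npx and uvx).
LAUNCHERS = {"npx", "uvx"}

# Assumption: a spec is "pinned" only if it ends in @<x.y[.z]> or ==<x.y[.z]>,
# optionally followed by a pre-release or build suffix. Floating tags such as
# "@latest" and ranges such as "@^1.2" therefore count as unpinned.
PINNED = re.compile(r".+(?:@|==)\d+(?:\.\d+)+(?:[-+][\w.]+)?")


def is_unpinned(command: str) -> bool:
    """Return True if command launches a package without an exact version."""
    tokens = shlex.split(command)
    if not tokens or os.path.basename(tokens[0]) not in LAUNCHERS:
        return False  # not a launcher this sketch knows about
    # Simplification: the first argument that is not a flag is the package spec.
    specs = [tok for tok in tokens[1:] if not tok.startswith("-")]
    if not specs:
        return False
    return PINNED.fullmatch(specs[0]) is None
```

Under these assumptions, `npx -y some-mcp-server` is flagged, while `npx -y some-mcp-server@1.4.2` and `uvx some-tool==0.3.0` are not.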

    agent-security test a2a --url https://agent.example.com
    agent-security test l402 --url https://l402.example.com
    agent-security test x402 --url https://your-x402-endpoint.com

The x402 suite tests the Coinbase/Stripe/Cloudflare agent payment standard ($600M+ payment volume):

| Test ID | Test | Category | Description |
|---|---|---|---|
| X4-001-003 | Payment Challenge Validation | payment_challenge | Missing headers, malformed auth, currency mismatch |
| X4-004-006 | Recipient Address Manipulation | recipient_manipulation | Dynamic payTo routing attacks (V2), address spoofing, invalid addresses |
| X4-007-010 | Session Token Security | session_security | Token fabrication, expiry bypass, sensitive data leakage in sessions |
| X4-011-013 | Spending Limit Exploitation | spending_limits | Rate limit bypass, underpayment, budget exhaustion |
| X4-014-016 | Facilitator Trust | facilitator_trust | Fake facilitator injection, verification bypass, unreachable facilitator |
| X4-017-018 | Information Disclosure | information_disclosure | Leaked keys in 402 response, stack traces in errors |
| X4-019-020 | Cross-Chain Confusion | cross_chain_confusion | Wrong network, wrong token type (EURC vs USDC) |

Features unique to the x402 harness:

- CSG Mapping - each test links to the Constitutional Self-Governance mechanism that catches it (Hard Constraints, Harm Test, Twelve Numbers, Falsification Requirement)
- Financial Impact Estimation - each result tagged: fund_theft, overpayment, service_denial, info_leak, or session_hijack
- Agent Autonomy Risk Score (0-100) - composite score answering "how dangerous is it to let an agent pay this endpoint unsupervised?" based on recipient consistency, payment validation, info leakage, session security, and facilitator trust
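
One way to picture a composite 0-100 score over the five factors listed above is a weighted average of per-factor failure rates. The sketch below is only that: the source does not state the actual formula, so `autonomy_risk_score`, the equal default weights, and the use of failure rates in the range 0 to 1 as inputs are all assumptions.

```python
from typing import Dict, Optional

# The five factors named in the feature list above.
FACTORS = (
    "recipient_consistency",
    "payment_validation",
    "info_leakage",
    "session_security",
    "facilitator_trust",
)


def autonomy_risk_score(
    failure_rates: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> int:
    """Combine per-factor failure rates (0.0 to 1.0) into a 0-100 score.

    Assumption: equal weights unless the caller supplies others. A factor
    missing from failure_rates is treated as 0.0 (no observed failures).
    """
    weights = weights or {name: 1.0 for name in FACTORS}
    total_weight = sum(weights[name] for name in FACTORS)
    weighted = 0.0
    for name in FACTORS:
        rate = min(max(failure_rates.get(name, 0.0), 0.0), 1.0)  # clamp to [0, 1]
        weighted += weights[name] * rate
    return round(100 * weighted / total_weight)
```

With equal weights, an endpoint that fails every recipient-consistency test and passes everything else scores 20; raising the weight of that factor pushes the score higher for the same results.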

Pre-configured tests for 20+ enterprise platforms where AI agents are being deployed:

- SAP Joule - ERP/SCADA security boundaries
- Salesforce Agentforce - CRM data isolation
- Workday - HR/Payroll PII protection
- Microsoft Copilot/Azure AI - Enterprise integration security
- Google Vertex AI - Cloud platform boundaries
- Amazon Q - AWS service integration
- Oracle Fusion AI - Database and financial system access
- ServiceNow Now Assist - ITSM workflow security
- OpenClaw - Session and tool isolation
- IBM Maximo, Snowflake Cortex, Databricks Mosaic AI
- Pega GenAI, UiPath, Atlassian Rovo
- Zendesk AI, IFS Cloud, Infor AI
- HubSpot Breeze, Appian AI

    # List all enterprise adapters
    agent-security list --category enterprise

    # Test specific platforms
    agent-security test enterprise --platform sap --url https://your-sap.com
    agent-security test enterprise --platform salesforce --url https://your-org.salesforce.com

| Module | Tests | Layer | Description |
|---|---|---|---|
| MCP Protocol | 13 | JSON-RPC 2.0 | Anthropic MCP wire-protocol testing |
| A2A Protocol | 12 | JSON-RPC/HTTP | Google Agent-to-Agent communication |
| L402 Payment | 33 | HTTP/Lightning | Bitcoin/Lightning payment flow security (macaroons, preimages, caveats) |
| x402 Payment | 25 | HTTP/USDC | Coinbase/Stripe agent payment protocol (recipient manipulation, session theft, facilitator trust, cross-chain confusion) |
| Framework Adapters | 11 | Various APIs | LangChain, CrewAI, AutoGen, OpenAI, Bedrock |
| Enterprise Platforms | 58 | Platform APIs | SAP, Salesforce, Workday, Oracle, ServiceNow, +15 more |
| GTG-1002 APT Simulation | 17 | Full Campaign | Simulates the first documented AI-orchestrated cyber-espionage campaign |
| Advanced Attacks | 10 | Multi-step | Polymorphic, stateful, multi-domain attack chains |
| Over-Refusal | 25 | All protocols | False positive rate testing: legitimate requests that should NOT be blocked |
| Provenance & Attestation | 15 | Supply Chain | Fake provenance, spoofed attestation, marketplace integrity (CVE-2026-25253) |
| Jailbreak | 25 | Model/Agent | DAN variants, token smuggling, authority impersonation, persistence |
| Return Channel | 8 | Output/Context | Return channel poisoning: output injection, ANSI escape, context overflow, encoded smuggling, structured data poisoning |
| Identity & Authorization | 18 | NIST NCCoE | All 6 focus areas from NIST agent identity standards |
| Capability Profile | 10 | A2A JSON-RPC | Executor capability boundary validation, profile escalation prevention |
| Harmful Output | 10 | A2A JSON-RPC | Toxicity, bias, scope violations, deception (AIUC-1 C003/C004) |
| CBRN Prevention | 8 | A2A JSON-RPC | Chemical/biological/radiological/nuclear content safeguards (AIUC-1 F002) |
| Incident Response | 8 | A2A JSON-RPC | Alert triggering, kill switch, log completeness, recovery (AIUC-1 E001-E003) |
| CVE-2026-25253 Reproduction | 8 | MCP Supply Chain | Nested schema injection, fork fingerprinting, marketplace contamination, encoded payload detection |
| AIUC-1 Compliance | 12 | Agent Safety | Incident response, CBRN prevention, harmful content, scope creep, authority impersonation |
| Cloud Agent Platforms | 25 | Platform APIs | AWS Bedrock, Azure AI Agent Service, Google Vertex, Salesforce Agentforce, IBM watsonx |