This document defines the external penetration-test scope for agent-bom and
the security exit criteria required before a v1.0 production-ready claim.
It complements SECURITY.md, docs/THREAT_MODEL.md, and docs/SECURITY_ARCHITECTURE.md.
The objective of the assessment is to validate that the operator-facing runtime and control-plane surfaces behave safely under realistic adversarial pressure, especially where agent-bom moves from passive inventory into active runtime mediation.
The third-party assessment should include these product surfaces:
- Runtime proxy
  - tool-call interception
  - policy enforcement and advisory-only operation
  - response inspection and audit generation
  - replay protection and integrity controls
- Multi-MCP gateway
  - upstream registry discovery and tenant scoping
  - policy evaluation and relay behavior
  - auth propagation to upstream MCP servers
  - rate limiting, health reporting, and policy reload behavior
- Control-plane API and dashboard
  - API key, trusted-proxy, and OIDC/JWT access modes
  - tenant isolation across fleet, graph, policy, audit, and remediation APIs
  - dashboard session handling and privilege boundaries
- Reference deployment paths
  - EKS / Helm deployment path
  - MCP proxy and gateway sidecar deployment path
  - secrets, ingress, service exposure, and default configuration posture
- Audit integrity and operator evidence
  - audit event completeness for block/allow/advisory paths
  - tamper-resistance assumptions and signed or chained integrity controls
  - replay and traceability of enforcement decisions
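
To make the "chained integrity controls" item concrete for assessors, the following is a minimal sketch of a hash-chained audit log and a verifier. It is not agent-bom's actual audit format: the record layout, the field names (`event`, `prev`, `hash`), and the all-zero genesis value are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def chain_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous link together with a canonical encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the record before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": chain_hash(prev, event)})


def verify_chain(log: list) -> int:
    """Return the index of the first broken record, or -1 if the chain is intact."""
    prev = GENESIS
    for i, record in enumerate(log):
        if record["prev"] != prev or record["hash"] != chain_hash(prev, record["event"]):
            return i
        prev = record["hash"]
    return -1


# Hypothetical events covering the allow, block, and advisory paths.
log = []
append_event(log, {"tenant": "tenant-a", "tool": "read_file", "decision": "allow"})
append_event(log, {"tenant": "tenant-a", "tool": "exec", "decision": "block"})
append_event(log, {"tenant": "tenant-b", "tool": "exec", "decision": "advisory"})
```

A chain of this shape detects in-place edits and mid-log deletions, but not truncation of the newest records or a full rewrite by someone who can recompute every hash. Those cases need an external anchor or a signature, which is the kind of assumption the assessment is meant to probe.
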
The assessment should explicitly attempt to:
- Bypass tenant isolation across API, gateway, persisted stores, and dashboard views.
- Bypass runtime policy enforcement through tool naming, argument shaping, relay edge cases, or mixed advisory/blocking policy combinations.
- Subvert gateway or proxy authentication, including OIDC/JWKS validation, trusted-proxy headers, API-key handling, and upstream credential propagation.
- Tamper with, suppress, or forge audit trails for runtime enforcement events.
- Escalate privileges through control-plane misconfiguration, default credentials, weak bindings, or deployment drift in the Helm / EKS path.
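
As an illustration of the trusted-proxy objective above, here is a minimal sketch of the check an assessor would try to defeat: an identity header is honored only when the direct peer is inside a configured proxy network. The header name, the network range, and the function name are hypothetical and do not describe agent-bom's implementation.

```python
import ipaddress

# Hypothetical configuration: networks the reverse proxy connects from.
TRUSTED_PROXY_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
IDENTITY_HEADER = "x-forwarded-user"  # assumed header name


def resolve_identity(peer_ip: str, headers: dict):
    """Return the forwarded identity only when the direct peer is a trusted proxy."""
    peer = ipaddress.ip_address(peer_ip)
    if not any(peer in net for net in TRUSTED_PROXY_NETWORKS):
        # Untrusted peers may set any header they like, so ignore it entirely.
        return None
    normalized = {key.lower(): value for key, value in headers.items()}
    return normalized.get(IDENTITY_HEADER) or None
```

A spoofing probe then reduces to sending the identity header from an address outside the trusted range, for example `resolve_identity("203.0.113.7", {"X-Forwarded-User": "admin"})`, and confirming that no identity is returned.
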
The engagement should cover more than a localhost demo. The preferred target environment is:
- one multi-tenant control plane
- one reference EKS deployment from repo-managed Helm / Terraform assets
- one runtime proxy deployment in front of representative MCP traffic
- one gateway deployment using discovered upstreams plus operator overrides
- realistic auth modes enabled:
  - API key
  - OIDC or trusted reverse proxy
- tenant-scoped runtime policies
Recommended seeded scenarios:
- at least two tenants with distinct API keys, data, policies, and audit trails
- at least one advisory-only policy set and one blocking policy set
- at least one upstream MCP server requiring propagated bearer auth
- persisted graph, audit, and remediation data already present before testing
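
The seeded scenarios above can be checked before the engagement starts. The sketch below models them as plain data and reports what is still missing; the class names, field names, and sample values are hypothetical and are not taken from agent-bom's schema.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    api_key: str
    policy_mode: str  # "advisory" or "blocking"
    has_persisted_data: bool = True  # graph, audit, and remediation data present


@dataclass
class Upstream:
    name: str
    requires_bearer_auth: bool


def seeding_gaps(tenants: list, upstreams: list) -> list:
    """Return a list of unmet seeding requirements; empty means ready."""
    gaps = []
    if len(tenants) < 2:
        gaps.append("need at least two tenants")
    if len({t.api_key for t in tenants}) != len(tenants):
        gaps.append("tenants must have distinct API keys")
    modes = {t.policy_mode for t in tenants}
    for mode in ("advisory", "blocking"):
        if mode not in modes:
            gaps.append(f"missing {mode} policy set")
    if not any(u.requires_bearer_auth for u in upstreams):
        gaps.append("need an upstream requiring propagated bearer auth")
    if not all(t.has_persisted_data for t in tenants):
        gaps.append("persisted data missing for at least one tenant")
    return gaps


# Hypothetical seed set that satisfies every requirement.
tenants = [
    Tenant("tenant-a", "key-a", "advisory"),
    Tenant("tenant-b", "key-b", "blocking"),
]
upstreams = [Upstream("mcp-files", requires_bearer_auth=True)]
```
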
Before the assessment starts, the repo and test environment should provide:
- deployment instructions for the reference EKS path
- a control-plane architecture diagram
- runtime proxy and gateway deployment examples
- auth mode matrix and operator guidance
- sample tenant data and test credentials
- expected audit and tracing outputs for normal and blocked flows
- known limitations and explicit out-of-scope statements
Unless separately commissioned, the baseline pentest does not need to cover:
- third-party hosted services outside agent-bom control
- vulnerability research into cloud providers or upstream MCP vendors
- zero-day hunting in dependencies unrelated to agent-bom’s exposed surfaces
- workstation physical access
- model-level attacks against external LLM providers
agent-bom should not claim v1.0 runtime-enforcement readiness until all of
the following are true:
- A qualified external third party completes a penetration test covering the in-scope areas above.
- All critical findings are fixed and verified closed.
- All high findings are fixed, or have documented compensating controls approved before release.
- Tenant isolation, authn/authz boundaries, and audit integrity findings are regression-tested in CI where practical.
- Helm / EKS deployment guidance is updated to reflect any hardening changes from the engagement.
- Security documentation is updated with:
  - engagement date
  - assessor name or firm
  - tested version / commit range
  - high-level findings summary
  - unresolved residual-risk statement, if any
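
The severity rules in the exit criteria above can be expressed as a small release gate. This is a sketch under assumed names: the `Finding` fields and the status values (`open`, `fixed`, `verified_closed`) are hypothetical, and the remaining criteria (CI regression tests, documentation updates) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    id: str
    severity: str  # "critical", "high", "medium", or "low"
    status: str  # "open", "fixed", or "verified_closed"
    compensating_control_approved: bool = False


def release_blockers(findings: list) -> list:
    """Return the IDs of findings that still block a v1.0 readiness claim."""
    blockers = []
    for f in findings:
        if f.severity == "critical":
            # Critical findings must be fixed and verified closed.
            if f.status != "verified_closed":
                blockers.append(f.id)
        elif f.severity == "high":
            # High findings must be fixed or have an approved compensating control.
            is_fixed = f.status in ("fixed", "verified_closed")
            if not is_fixed and not f.compensating_control_approved:
                blockers.append(f.id)
    return blockers


# Hypothetical findings from an engagement report.
findings = [
    Finding("F-1", "critical", "fixed"),
    Finding("F-2", "critical", "verified_closed"),
    Finding("F-3", "high", "open", compensating_control_approved=True),
    Finding("F-4", "high", "open"),
    Finding("F-5", "medium", "open"),
]
```

With these sample findings, `release_blockers(findings)` reports `F-1` (fixed but not yet verified closed) and `F-4` (open with no approved compensating control).
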
Status as of April 23, 2026:
- No third-party penetration test has been completed yet.
- Internal controls, CI security checks, and threat-model documentation exist.
- The remaining gap is independent validation of the runtime and multi-tenant control-plane surfaces before v1.0.