Skip to content

Latest commit

 

History

History
189 lines (150 loc) · 9.07 KB

File metadata and controls

189 lines (150 loc) · 9.07 KB

UnboxAPI Safety Proxy — Threat Model Memo (v0.1.0)

Author: CTO, UnboxAPI Date: 2026-05-27 Scope: Public release of the safety-proxy / context-injection interface skeleton (Python interface definitions + one trivial reference hook) under Apache-2.0 to UnboxAPI-SafetyProxy. This memo covers risks introduced by publishing the skeleton, not risks of operating the proprietary UnboxAPI production runtime.


1. Assets and trust boundaries

Asset Owner Trust level
Interface definitions (interfaces.py) UnboxAPI (publisher) Trusted authorship; integrity guaranteed by signed commit + signed tag + Sigstore attestation.
LoggingHook reference implementation UnboxAPI (publisher) Trusted authorship; ships as example only.
CallContext fields at runtime (produced by callers) Third-party callers Untrusted. Every field must be treated as untrusted data.
Production rule library, classifiers, spend-cap logic UnboxAPI Not shipped. Interface-only release.
Consumer hook implementations (derived works) Third-party developers Out of scope — they implement the interface; we cannot audit their logic.

The trust boundary is clear: we publish an interface and one trivial example hook; we do not publish production safety logic. Third parties that implement hooks or build on this library own their own security posture.


2. Threat enumeration

T1 — Skeleton mistaken for a production safety control

Risk: A developer reads "safety proxy" in the repo name or PyPI description and integrates LoggingHook as a real safety gate, believing it blocks malicious tool calls. This is a misuse scenario enabled by the name and framing of the repo.

Attack path:

  1. Developer pip install unboxapi-safety-proxy.
  2. Registers LoggingHook.
  3. Ships to production believing the proxy "has safety rules."
  4. Every call is permitted regardless of content.

Mitigations:

  • Prominent ⚠ NOT PRODUCTION SAFETY ⚠ banner at the top of README (first visible content; non-negotiable).
  • HookAction.ALLOW reason string explicitly reads "unconditional ALLOW — NOT a security decision" so log inspection reveals the lack of enforcement.
  • LoggingHook docstring contains "performs zero security evaluation and is not suitable for use as a safety control" (verbatim).
  • pyproject.toml description field reads "NOT PRODUCTION SAFETY — see README".
  • SECURITY.md opens with the ⚠ NOT PRODUCTION SAFETY ⚠ callout.
  • No hook in this repo is named, documented, or structured to resemble a blocking rule. The reference hook name is Logging, not Safety, Filter, or Guard.

Residual risk: A developer who ignores all warnings can still misuse this library. That risk is inherent to publishing any open-source skeleton; we mitigate by maximising warning surface.

T2 — Prompt injection via hook input fields

Risk: The CallContext fields (tool_name, tool_args, tenant_id, metadata) originate from callers and may contain adversarial content. A hook implementation that passes these fields into an LLM system prompt, a format string, or a shell command is vulnerable to injection.

Attack paths (reference hook — LoggingHook):

  • T2.1 Log injection via format string: If a hook used %s or f-string interpolation to build a log message from context.tool_name, an attacker could inject log-forging content (e.g. newlines, fake structured log fields). Mitigation: LoggingHook.evaluate logs context.tool_name via extra={} structured logging, not into the message format string. The docstring explicitly warns against changing this pattern.
  • T2.2 Upstream injection in production hooks (not shipped): Any hook that feeds context fields into an LLM prompt must treat them as user-role, bounded, escaped input — never as system instructions. Codified in the SafetyHook.evaluate docstring: "Treat every field of context as untrusted data" and "Pass context fields directly into format strings or system prompts" is listed as a MUST NOT.

Mitigations (shipped in this skeleton):

  • CallContext is frozen=True (immutable dataclass) — hooks cannot mutate the context and leave poisoned state for downstream hooks.
  • SafetyHook.evaluate docstring explicitly lists injection vectors in MUST NOT section.
  • LoggingHook uses structured logging pattern resistant to log injection.

Residual risk: Third-party hook implementations are outside our control. The interfaces.py docstrings are the primary mitigation available to us; we cannot audit consumer code.

T3 — Supply-chain risk on any dependency

v0.1.0 ships zero runtime dependencies (dependencies = [] in pyproject.toml). Standard library only.

Build-time dependencies:

  • setuptools>=68 and wheel (build only, not installed in consumer envs).
  • mypy==1.10.0 (CI type-check; pinned; not a runtime dep).

CI tooling:

  • gitleaks 8.21.2 binary (pinned; hash-verified via Releases tag).
  • osv-scanner 1.8.5 binary (pinned; hash-verified via Releases tag).
  • semgrep/semgrep container (pulled at CI time; SBOM-of-nothing since no lockfile shipped).

Mitigations:

  • Zero runtime dependency surface at v0.1.0.
  • Dependabot enabled on the repo for the moment a runtime dep lands.
  • CI tool versions pinned by release tag.
  • SBOM (CycloneDX) published as a release asset; enumerates source files + LICENSE + zero transitive deps.
  • Branch protection + CODEOWNERS prevent unapproved dependency additions.

Residual risk: CI action tags (actions/checkout@v4, etc.) are major- version-mutable. This is a low-severity supply-chain vector accepted at v0.1.0 (logged as RA-1); pin to commit SHAs in v0.1.1.

T4 — Information disclosure from publishing the interface

Risk: Publishing the safety-proxy interface reveals architectural patterns of the proprietary runtime, potentially helping competitors reverse-engineer the moat.

Assessment: The interface (hookable lifecycle points, CallContext, HookResult) is commodity architecture. The moat is:

  • The production rule library (not shipped).
  • Vetted vertical SemanticMaps (not shipped).
  • Prompt-injection classifiers (not shipped).
  • EU AI Act disclosure wrappers (not shipped).

Publishing the interface shape transfers none of the moat. Reviewed against [DHA-8 §A.3] and [DHA-5 §2.2].

Additionally, gitleaks runs on the full commit range before push to guarantee no accidental secrets or internal endpoint references land in the public repo.

T5 — Hook implementation vulnerabilities by third parties

Risk: Third parties implement SafetyHook and introduce vulnerabilities (prompt injection, SSRF, command injection) in their own hook code. These vulnerabilities are not in this repo but may be attributed to it.

Mitigations we own:

  • SafetyHook.evaluate docstring is explicit about MUST NOT patterns (blocking I/O, process spawning, LLM injection, context mutation).
  • CallContext immutability prevents one hook poisoning context for another.
  • SECURITY.md has a coordinated-disclosure path so third-party vulnerabilities can be disclosed to us privately and we can issue advisory guidance.

Residual risk: We cannot audit third-party hooks. The docstring mitigations are best-effort guidance.


3. Residual risk register

ID Risk Severity Accepted? Fix-by
RA-1 CI action tags pinned to major version, not commit SHA Low Yes — no runtime artifact impact v0.1.1
RA-2 PGP release key not yet minted Low Yes — plaintext security@unboxapi.pro is operable v0.1.1
RA-3 CODEOWNERS literal placeholder must be substituted pre-push Low → mitigated by runbook hard-fail Yes First step of push runbook
RA-4 Third-party hook implementations not auditable Inherent Yes — docstring mitigations are best-effort Ongoing

4. Pre-push checklist (must be green at v0.1.0 tag)

  • gitleaks clean on full commit range — zero findings
  • osv-scanner clean — zero High/Critical
  • semgrep (p/owasp-top-ten + p/supply-chain + p/python) clean — zero High/Critical
  • mypy --strict passes on package
  • /security-review skill run on final diff — all High/Critical addressed or justified
  • License/IP review — Apache-2.0; no GPL/AGPL transitives; no copied snippets without attribution
  • Branch protection on main: required PR review, required status checks, no direct push, no force-push, linear history
  • Signed commits enforced; release tag signed
  • Sigstore artifact attestation on release assets
  • CycloneDX SBOM published as release asset
  • SECURITY.md published; coordinated-disclosure contact active
  • CODEOWNERS @<cto-github-handle> substituted (HARD-FAIL if skipped)
  • Dependabot + secret scanning + Advanced Security enabled
  • CEO sign-off recorded as comment on DHA-28
  • Founder sign-off recorded as comment on DHA-28 (founder-authority on public push)

Memo status: ready for CEO + Founder review as item 1 of the Plan v2 §3.0 security gate.