UnboxAPI Safety Proxy — Threat Model Memo (v0.1.0)

Author: CTO, UnboxAPI Date: 2026-05-27 Scope: Public release of the safety-proxy / context-injection interface skeleton (Python interface definitions + one trivial reference hook) under Apache-2.0 to UnboxAPI-SafetyProxy. This memo covers risks introduced by publishing the skeleton, not risks of operating the proprietary UnboxAPI production runtime.

1. Assets and trust boundaries

Asset	Owner	Trust level
Interface definitions (`interfaces.py`)	UnboxAPI (publisher)	Trusted authorship; integrity guaranteed by signed commit + signed tag + Sigstore attestation.
`LoggingHook` reference implementation	UnboxAPI (publisher)	Trusted authorship; ships as example only.
`CallContext` fields at runtime (produced by callers)	Third-party callers	Untrusted. Every field must be treated as untrusted data.
Production rule library, classifiers, spend-cap logic	UnboxAPI	Not shipped. Interface-only release.
Consumer hook implementations (derived works)	Third-party developers	Out of scope — they implement the interface; we cannot audit their logic.

The trust boundary is clear: we publish an interface and one trivial example hook; we do not publish production safety logic. Third parties that implement hooks or build on this library own their own security posture.

2. Threat enumeration

T1 — Skeleton mistaken for a production safety control

Risk: A developer reads "safety proxy" in the repo name or PyPI description and integrates LoggingHook as a real safety gate, believing it blocks malicious tool calls. This is a misuse scenario enabled by the name and framing of the repo.

Attack path:

Developer pip install unboxapi-safety-proxy.
Registers LoggingHook.
Ships to production believing the proxy "has safety rules."
Every call is permitted regardless of content.

Mitigations:

Prominent ⚠ NOT PRODUCTION SAFETY ⚠ banner at the top of README (first visible content; non-negotiable).
HookAction.ALLOW reason string explicitly reads "unconditional ALLOW — NOT a security decision" so log inspection reveals the lack of enforcement.
LoggingHook docstring contains "performs zero security evaluation and is not suitable for use as a safety control" (verbatim).
pyproject.toml description field reads "NOT PRODUCTION SAFETY — see README".
SECURITY.md opens with the ⚠ NOT PRODUCTION SAFETY ⚠ callout.
No hook in this repo is named, documented, or structured to resemble a blocking rule. The reference hook name is Logging, not Safety, Filter, or Guard.

Residual risk: A developer who ignores all warnings can still misuse this library. That risk is inherent to publishing any open-source skeleton; we mitigate by maximising warning surface.

T2 — Prompt injection via hook input fields

Risk: The CallContext fields (tool_name, tool_args, tenant_id, metadata) originate from callers and may contain adversarial content. A hook implementation that passes these fields into an LLM system prompt, a format string, or a shell command is vulnerable to injection.

Attack paths (reference hook — LoggingHook):

T2.1 Log injection via format string: If a hook used %s or f-string interpolation to build a log message from context.tool_name, an attacker could inject log-forging content (e.g. newlines, fake structured log fields). Mitigation: LoggingHook.evaluate logs context.tool_name via extra={} structured logging, not into the message format string. The docstring explicitly warns against changing this pattern.
T2.2 Upstream injection in production hooks (not shipped): Any hook that feeds context fields into an LLM prompt must treat them as user-role, bounded, escaped input — never as system instructions. Codified in the SafetyHook.evaluate docstring: "Treat every field of context as untrusted data" and "Pass context fields directly into format strings or system prompts" is listed as a MUST NOT.

Mitigations (shipped in this skeleton):

CallContext is frozen=True (immutable dataclass) — hooks cannot mutate the context and leave poisoned state for downstream hooks.
SafetyHook.evaluate docstring explicitly lists injection vectors in MUST NOT section.
LoggingHook uses structured logging pattern resistant to log injection.

Residual risk: Third-party hook implementations are outside our control. The interfaces.py docstrings are the primary mitigation available to us; we cannot audit consumer code.

T3 — Supply-chain risk on any dependency

v0.1.0 ships zero runtime dependencies (dependencies = [] in pyproject.toml). Standard library only.

Build-time dependencies:

setuptools>=68 and wheel (build only, not installed in consumer envs).
mypy==1.10.0 (CI type-check; pinned; not a runtime dep).

CI tooling:

gitleaks 8.21.2 binary (pinned; hash-verified via Releases tag).
osv-scanner 1.8.5 binary (pinned; hash-verified via Releases tag).
semgrep/semgrep container (pulled at CI time; SBOM-of-nothing since no lockfile shipped).

Mitigations:

Zero runtime dependency surface at v0.1.0.
Dependabot enabled on the repo for the moment a runtime dep lands.
CI tool versions pinned by release tag.
SBOM (CycloneDX) published as a release asset; enumerates source files + LICENSE + zero transitive deps.
Branch protection + CODEOWNERS prevent unapproved dependency additions.

Residual risk: CI action tags (actions/checkout@v4, etc.) are major- version-mutable. This is a low-severity supply-chain vector accepted at v0.1.0 (logged as RA-1); pin to commit SHAs in v0.1.1.

T4 — Information disclosure from publishing the interface

Risk: Publishing the safety-proxy interface reveals architectural patterns of the proprietary runtime, potentially helping competitors reverse-engineer the moat.

Assessment: The interface (hookable lifecycle points, CallContext, HookResult) is commodity architecture. The moat is:

The production rule library (not shipped).
Vetted vertical SemanticMaps (not shipped).
Prompt-injection classifiers (not shipped).
EU AI Act disclosure wrappers (not shipped).

Publishing the interface shape transfers none of the moat. Reviewed against [DHA-8 §A.3] and [DHA-5 §2.2].

Additionally, gitleaks runs on the full commit range before push to guarantee no accidental secrets or internal endpoint references land in the public repo.

T5 — Hook implementation vulnerabilities by third parties

Risk: Third parties implement SafetyHook and introduce vulnerabilities (prompt injection, SSRF, command injection) in their own hook code. These vulnerabilities are not in this repo but may be attributed to it.

Mitigations we own:

SafetyHook.evaluate docstring is explicit about MUST NOT patterns (blocking I/O, process spawning, LLM injection, context mutation).
CallContext immutability prevents one hook poisoning context for another.
SECURITY.md has a coordinated-disclosure path so third-party vulnerabilities can be disclosed to us privately and we can issue advisory guidance.

Residual risk: We cannot audit third-party hooks. The docstring mitigations are best-effort guidance.

3. Residual risk register

ID	Risk	Severity	Accepted?	Fix-by
RA-1	CI action tags pinned to major version, not commit SHA	Low	Yes — no runtime artifact impact	v0.1.1
RA-2	PGP release key not yet minted	Low	Yes — plaintext `security@unboxapi.pro` is operable	v0.1.1
RA-3	CODEOWNERS literal placeholder must be substituted pre-push	Low → mitigated by runbook hard-fail	Yes	First step of push runbook
RA-4	Third-party hook implementations not auditable	Inherent	Yes — docstring mitigations are best-effort	Ongoing

4. Pre-push checklist (must be green at v0.1.0 tag)

Memo status: ready for CEO + Founder review as item 1 of the Plan v2 §3.0 security gate.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

UnboxAPI Safety Proxy — Threat Model Memo (v0.1.0)

1. Assets and trust boundaries

2. Threat enumeration

T1 — Skeleton mistaken for a production safety control

T2 — Prompt injection via hook input fields

T3 — Supply-chain risk on any dependency

T4 — Information disclosure from publishing the interface

T5 — Hook implementation vulnerabilities by third parties

3. Residual risk register

4. Pre-push checklist (must be green at v0.1.0 tag)

FilesExpand file tree

threat-model.md

Latest commit

History

threat-model.md

File metadata and controls

UnboxAPI Safety Proxy — Threat Model Memo (v0.1.0)

1. Assets and trust boundaries

2. Threat enumeration

T1 — Skeleton mistaken for a production safety control

T2 — Prompt injection via hook input fields

T3 — Supply-chain risk on any dependency

T4 — Information disclosure from publishing the interface

T5 — Hook implementation vulnerabilities by third parties

3. Residual risk register

4. Pre-push checklist (must be green at v0.1.0 tag)