RFC 0005: Threat Model & Attack Vectors

Status: Draft
Author: Anna Popivanova
Created: 2025-10-01
Tags: attestation, runtime, environment, verification, trust, handshake, identity

Motivation

Any protocol that claims to secure AI identity must clearly define what it secures against. Without a threat model, airlock would be a handshake in the dark — elegant, but untested. This RFC outlines the adversarial scenarios airlock is designed to detect, prevent, or mitigate, and clarifies its boundaries and assumptions.

Security Goals

Ensure cryptographic identity of AI agents at runtime
Detect unauthorized forks or tampered models
Prevent impersonation and replay attacks
Attest to runtime environment integrity
Enable auditability and trust propagation

Common Attack Vectors

Attack Type	Description	airlock Defense
Model Spoofing	An attacker mimics a known model’s behavior or branding	Fingerprint + signature verification
Fork Drift	A model is fine-tuned or altered post-deployment	Fingerprint mismatch detection
Replay Attack	Reuse of a previously valid handshake	Nonce freshness enforcement
Environment Tampering	Model runs in altered or untrusted runtime	Environment hash attestation
Audit Forgery	Fake audit tokens simulate trust history	Signed audit token lineage
Registry Poisoning	Malicious agents inserted into registry	Governance + multi-sig endorsement (future RFC)
Man-in-the-Middle (MitM)	Interception or alteration of handshake	TLS + signature validation
Drift During Session	Model changes behavior mid-interaction	Stream Verification (RFC 0006)
Behavioral Mimicry	tbd	tbd
Affective Baseline Evasion	An agent is prompted, fine-tuned, or adversarially guided to suppress its characteristic AI affective signature — flattening sentiment entropy, neutralising linguistic style markers, or mimicking human affective patterns — specifically to evade emoprint detection. Unlike behavioral mimicry of a specific known agent, this attack targets the baseline itself rather than a registered identity.	Emoprint baselines are versioned, rotated on a defined schedule, and where operationally feasible kept non-public. Deviation detection operates on multiple independent affective dimensions simultaneously, raising the cost of full-spectrum evasion. A successful evasion that passes all dimensions is itself a detectable anomaly: an agent with no measurable affective signature is flagged as suspect, not trusted. Absence of signal is treated as a threat signal.
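
The last row above describes two rules: deviation is checked on several affective dimensions at once, and an agent with no measurable signature is flagged rather than trusted. The following is a minimal sketch of that decision logic only. It is not the emoprint algorithm, which this RFC does not specify. The dimension names, the 0-to-1 score scale, and the `tolerance` and `signal_floor` thresholds are all assumptions chosen for illustration.

```python
from typing import Dict

def classify_emoprint(
    observed: Dict[str, float],
    baseline: Dict[str, float],
    tolerance: float = 0.2,      # hypothetical per-dimension tolerance
    signal_floor: float = 0.05,  # hypothetical "no measurable signal" cutoff
) -> str:
    """Compare observed affective scores against a baseline (illustrative only)."""
    # Absence of signal is treated as a threat signal: a flat profile on every
    # dimension is flagged before any per-dimension comparison is made.
    if all(abs(score) < signal_floor for score in observed.values()):
        return "suspect: no measurable signature"

    # Check every baseline dimension independently; a missing dimension
    # is scored as 0.0 so it cannot silently pass.
    deviating = sorted(
        name
        for name, expected in baseline.items()
        if abs(observed.get(name, 0.0) - expected) > tolerance
    )
    if deviating:
        return "deviation: " + ", ".join(deviating)
    return "consistent"
```

Under these assumptions, an agent that flattens every dimension to near zero is reported as suspect instead of passing as "no deviation found", which is the behavior the table row asks for.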

Assumptions

Verifiers have access to a trusted registry of agent fingerprints
Agents possess private keys for signing identity claims
Handshake occurs over a secure transport (e.g. TLS)
Registry governance is out of scope for this RFC (see future RFC 0007)
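
Given these assumptions, three of the defenses in the attack table (fingerprint mismatch detection, environment hash attestation, and nonce freshness enforcement) could be sketched roughly as follows. This is an illustration, not the airlock wire format. The registry layout, the `issued_at` field, the 60-second freshness window, and the `NonceCache` and `verify_handshake` names are hypothetical. Signature verification is deliberately left out because it needs an asymmetric-crypto library.

```python
import hashlib
import hmac
import time
from typing import Dict, List, Optional

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, used here as a stand-in fingerprint function."""
    return hashlib.sha256(data).hexdigest()

class NonceCache:
    """Remembers nonces seen inside a freshness window (hypothetical helper)."""

    def __init__(self, max_age_seconds: float = 60.0) -> None:
        self.max_age = max_age_seconds
        self._seen: Dict[str, float] = {}

    def accept(self, nonce: str, issued_at: float, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget nonces whose timestamp has left the window; a replay of one
        # of those is still rejected by the staleness check below.
        self._seen = {n: t for n, t in self._seen.items() if now - t <= self.max_age}
        if abs(now - issued_at) > self.max_age:
            return False  # too old, or dated too far in the future
        if nonce in self._seen:
            return False  # replay inside the window
        self._seen[nonce] = issued_at
        return True

def verify_handshake(
    payload: dict,
    registry: Dict[str, Dict[str, str]],
    nonces: NonceCache,
    now: Optional[float] = None,
) -> List[str]:
    """Return the list of failed checks; an empty list means all passed."""
    expected = registry.get(payload["agent_id"])
    if expected is None:
        return ["unknown agent"]
    failures = []
    # compare_digest gives a constant-time comparison of the hex strings.
    if not hmac.compare_digest(payload["fingerprint"], expected["fingerprint"]):
        failures.append("fingerprint mismatch")
    if not hmac.compare_digest(payload["environment_hash"], expected["environment_hash"]):
        failures.append("environment hash mismatch")
    if not nonces.accept(payload["nonce"], payload["issued_at"], now):
        failures.append("stale or replayed nonce")
    return failures
```

In this sketch a first handshake with a fresh nonce returns an empty list, the same payload sent a second time returns `["stale or replayed nonce"]`, and a payload whose fingerprint differs from the registry entry returns `["fingerprint mismatch"]`.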

Out-of-Scope Threats

Data poisoning during model training
Adversarial prompt injection
Hardware-level attacks (unless attested via enclave integration)
Social engineering of registry maintainers

Privacy Considerations

airlock is designed to verify the identity of AI agents — not humans. The protocol does not process, transmit, or store any Personally Identifiable Information (PII) or Sensitive PII (SPII).

All identifiers (e.g., agent_id, fingerprint, environment_hash) are cryptographic or system-level artifacts that do not correspond to individuals. Audit tokens and handshake payloads are machine-verifiable and privacy-neutral by design.
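
One way an implementation might enforce this property is to reject any handshake payload that carries fields outside a fixed allowlist. The sketch below assumes such an allowlist. `agent_id`, `fingerprint`, and `environment_hash` come from the text above, while the `nonce` and `signature` field names and the `unexpected_fields` helper are hypothetical.

```python
from typing import List

# Fields a handshake payload may carry. The last two names are assumptions.
ALLOWED_FIELDS = frozenset(
    {"agent_id", "fingerprint", "environment_hash", "nonce", "signature"}
)

def unexpected_fields(payload: dict) -> List[str]:
    """Return, in sorted order, any payload keys that are not on the allowlist."""
    return sorted(set(payload) - ALLOWED_FIELDS)
```

A verifier could refuse any payload for which this function returns a non-empty list, so that a stray field such as an operator email address never enters handshake processing or audit tokens.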

Future Work

Formal adversarial simulation suite
Integration with secure enclaves (e.g., SGX, SEV)
Trust graph propagation and multi-party endorsement
Registry governance and revocation mechanisms
