# PROTOCOL.md
## Forensic Provenance Protocol (FPP)
## 1. Purpose
The Forensic Provenance Protocol (FPP) defines a deterministic, graph-based model for reconstructing the **transactional provenance** of digital assets across arbitrary depth, both backward and forward in time.
The protocol is designed for **forensic analysis**, **risk assessment**, and **probabilistic attribution**, not for identity resolution or surveillance.
FPP answers questions of the form:
* *Where did these funds originate?*
* *How did value propagate through the network?*
* *What transformations or obfuscations occurred along the way?*
* *With what confidence can flows be linked across breaks?*
## 2. Core Abstractions
### 2.1 Ledger Event
A **Ledger Event** is any on-chain occurrence that can change the state, ownership, or representation of value.
Examples:
* Native transfers
* Token transfers
* Contract calls
* Mint / burn events
* Bridge lock / mint pairs
* Mixer entry / exit events
Ledger Events are **observed facts**, not interpreted actions.
### 2.2 Provenance Graph
FPP models all activity as a **directed acyclic graph (DAG)**:
* **Vertices (Nodes)** represent *events or states*
* **Edges** represent *value flow or transformation*
* **Direction** is always **forward in time**
Cycles are disallowed at the protocol level.
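If activity on a ledger appears cyclic (for example, value returning to an address that held it earlier), it is represented as a linearized, time-ordered sequence of state transitions, with each occurrence as a separate node, so the graph stays acyclic.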
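The forward-in-time rule can be checked mechanically. The sketch below is illustrative only: it assumes each node has a total-order key of `(block_height, index_within_block)`, and the function name and node labels are hypothetical. If every edge runs from a strictly earlier key to a strictly later one, no cycle can exist.

```python
from typing import Dict, List, Tuple

# Assumed ordering key per node: (block_height, index_within_block).
OrderKey = Tuple[int, int]


def forward_in_time_violations(
    order: Dict[str, OrderKey],
    edges: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return every edge whose destination is not strictly later than its source."""
    return [(src, dst) for src, dst in edges if not order[src] < order[dst]]


# An address that holds value twice appears as two distinct nodes, one per
# occurrence, so the "return" edge still points forward in time.
order = {"A@100": (100, 0), "B@101": (101, 0), "A@102": (102, 0)}
edges = [("A@100", "B@101"), ("B@101", "A@102")]
violations = forward_in_time_violations(order, edges)
```

An empty result means the edge set respects the ordering and is therefore acyclic.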
## 3. Node Types
Every node in the provenance graph has a **type**, **timestamp**, and **ledger reference**.
### 3.1 Transaction Node
Represents a concrete on-chain transaction.
Properties:
* Transaction hash
* Block height / slot
* Timestamp
* Input addresses
* Output addresses
* Asset identifiers
* Amounts
### 3.2 State Node
Represents a derived or intermediate asset state.
Examples:
* Post-mix output pool
* Bridge custody state
* Contract escrow state
* Aggregated UTXO state
A state node does not necessarily map 1:1 to a transaction.
### 3.3 Synthetic Node
Represents a **logical grouping** introduced by the protocol for analysis.
Examples:
* Mixer anonymity set
* Exchange hot wallet cluster
* Batch settlement window
* Known protocol pool
Synthetic nodes are **explicitly marked** and never treated as on-chain facts.
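One way to carry the three required node attributes, and the explicit synthetic marking, is a small immutable record. This is a sketch under assumptions: the class and field names are hypothetical, the timestamp is taken to be Unix seconds, and the ledger reference is taken to be an opaque string.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    TRANSACTION = "transaction"
    STATE = "state"
    SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class Node:
    node_id: str         # hypothetical identifier, unique within one graph
    node_type: NodeType
    timestamp: int       # assumed: Unix seconds
    ledger_ref: str      # assumed: opaque reference such as "chain:txhash"

    @property
    def is_synthetic(self) -> bool:
        # Synthetic nodes are analysis constructs, never on-chain facts.
        return self.node_type is NodeType.SYNTHETIC


tx = Node("n1", NodeType.TRANSACTION, 1_700_000_000, "chain:0xabc")
pool = Node("n2", NodeType.SYNTHETIC, 1_700_000_100, "chain:mixer-set")
```

Making the record frozen keeps nodes hashable and prevents accidental mutation during traversal.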
## 4. Edge Semantics
Edges represent **value propagation**.
Each edge has:
* Source node
* Destination node
* Asset type
* Amount or fraction
* Confidence weight ∈ (0, 1]
### 4.1 Deterministic Edge
Used when value flow is provably exact.
Examples:
* Direct transfer
* Single-input / single-output transaction
* Verified bridge mint corresponding to a lock
Confidence = 1.0
### 4.2 Probabilistic Edge
Used when exact flow cannot be determined.
Examples:
* Mixers
* CoinJoin-style transactions
* Exchange internal rebalancing
* Obfuscated smart contracts
0 < Confidence < 1.0
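The edge attributes and the confidence range can be sketched as a validated record. The class and field names here are hypothetical, and using `Decimal` for amounts is an assumption made to avoid float rounding on values.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Edge:
    source: str
    destination: str
    asset: str
    amount: Decimal     # amount or fraction of value carried by this edge
    confidence: float   # must lie in (0, 1]

    def __post_init__(self) -> None:
        if not (0.0 < self.confidence <= 1.0):
            raise ValueError(f"confidence must be in (0, 1], got {self.confidence}")

    @property
    def is_deterministic(self) -> bool:
        # Deterministic edges carry confidence exactly 1.0; all others are probabilistic.
        return self.confidence == 1.0


direct = Edge("tx1", "tx2", "ETH", Decimal("1.5"), 1.0)
mixed = Edge("tx2", "mixer_out", "ETH", Decimal("1.5"), 0.25)
```

Rejecting out-of-range values at construction time means a graph can never contain an edge with zero or negative confidence.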
## 5. Provenance Traversal
### 5.1 Backward Traversal (Ancestry)
Given a target node *N*, backward traversal reconstructs, up to a user-defined depth *d*:
* All predecessor nodes
* All inbound edges
Within that depth, traversal of a branch **never halts early** unless:
* Genesis is reached
* Asset mint is reached
* Confidence drops below threshold ε
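The halting rules can be illustrated with a small path enumerator. This is a sketch, not a normative algorithm: the adjacency shape, the function name, and the choice to enumerate whole paths are assumptions. A node with no inbound edges stands in for genesis or an asset mint.

```python
from typing import Dict, List, Tuple

# Hypothetical adjacency: node -> list of (predecessor, edge confidence).
Inbound = Dict[str, List[Tuple[str, float]]]


def backward_paths(
    inbound: Inbound, target: str, max_depth: int, epsilon: float
) -> List[Tuple[List[str], float]]:
    """Enumerate ancestry paths from target, listing the newest node first."""
    results: List[Tuple[List[str], float]] = []
    stack: List[Tuple[List[str], float]] = [([target], 1.0)]
    while stack:
        path, conf = stack.pop()
        extended = False
        if len(path) - 1 < max_depth:  # depth is counted in edges
            for pred, edge_conf in sorted(inbound.get(path[-1], [])):
                new_conf = conf * edge_conf
                if new_conf < epsilon:
                    continue  # confidence dropped below the threshold
                stack.append((path + [pred], new_conf))
                extended = True
        if not extended:
            # Origin reached, depth limit hit, or every continuation pruned.
            results.append((path, conf))
    return sorted(results)  # sorted so output order is reproducible


inbound = {"D": [("B", 1.0), ("C", 0.5)], "B": [("A", 1.0)], "C": [("A", 0.1)]}
paths = backward_paths(inbound, "D", max_depth=5, epsilon=0.1)
# paths == [(["D", "B", "A"], 1.0), (["D", "C"], 0.5)]
```

The branch through `C` stops at `C` because extending it to `A` would give 0.5 × 0.1 = 0.05, which is below ε = 0.1.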
### 5.2 Forward Traversal (Descendants)
Given a source node *N*, forward traversal reconstructs:
* All successor nodes
* All outbound edges
* Potential future dispersal paths
Used for:
* Risk propagation
* Exposure analysis
* Contamination modeling
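A depth-bounded forward walk can be sketched as a breadth-first search. The adjacency shape and function name are hypothetical, and recording only the smallest edge count per successor is a simplification chosen for brevity.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# Hypothetical adjacency: node -> list of successor nodes.
Outbound = Dict[str, List[str]]


def descendants(outbound: Outbound, source: str, max_depth: int) -> Dict[str, int]:
    """Map each reachable successor to its smallest edge count from source."""
    depth_of: Dict[str, int] = {}
    queue: Deque[Tuple[str, int]] = deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for nxt in sorted(outbound.get(node, [])):  # sorted for reproducible order
            if nxt not in depth_of:
                depth_of[nxt] = depth + 1
                queue.append((nxt, depth + 1))
    return depth_of


outbound = {"S": ["X", "Y"], "X": ["Z"], "Y": ["Z"]}
exposure = descendants(outbound, "S", max_depth=2)
# exposure == {"X": 1, "Y": 1, "Z": 2}
```

Here `Z` is reachable through both `X` and `Y`; a fuller implementation would keep both paths rather than only the shortest distance.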
## 6. Confidence Propagation
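Confidence is multiplicative along paths.
For a path *P* with edges e₁ … eₙ:
confidence(P) = confidence(e₁) × confidence(e₂) × … × confidence(eₙ)
Because every edge confidence lies in (0, 1], path confidence can only stay the same or decrease as a path grows longer.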
Multiple paths between the same nodes are **not collapsed** automatically.
Aggregation strategies are explicitly defined outside the core protocol.
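As a worked example of the product rule, the sketch below computes confidence per path and keeps parallel paths separate. The function name and sample values are illustrative assumptions; floats are used for brevity, and an implementation concerned with bit-exact reproducibility might prefer rationals or fixed-point values.

```python
import math
from typing import Dict, List, Sequence


def path_confidence(edge_confidences: Sequence[float]) -> float:
    """Multiply the confidences of the edges along one path."""
    return math.prod(edge_confidences)


# Two distinct paths between the same pair of nodes (illustrative values).
paths: Dict[str, List[float]] = {
    "via_mixer": [1.0, 0.5, 0.5],
    "direct": [1.0, 1.0],
}
per_path = {name: path_confidence(confs) for name, confs in paths.items()}
# per_path == {"via_mixer": 0.25, "direct": 1.0}
```

The two results are reported side by side and are not merged into one number, since aggregation is left outside the core protocol.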
## 7. Depth Control
Traversal depth is **explicit and user-controlled**.
* Depth is measured as the number of edges from the starting node
* The protocol imposes no implicit limits
* Any depth ≥ 1 is supported, with no upper bound
This enables:
* Shallow inspection
* Deep forensic reconstruction
* Very large traversals, bounded only by available compute
## 8. Obfuscation Awareness (Non-breaking)
Obfuscation mechanisms are **modeled**, not defeated.
The protocol does not claim:
* De-anonymization
* Identity recovery
* Guaranteed attribution
Instead, it provides:
* Explicit uncertainty
* Confidence decay
* Explicit branching into every candidate path where flows cannot be disambiguated
## 9. Determinism Guarantee
Given:
* Same ledger data
* Same protocol version
* Same traversal parameters
FPP **must** produce:
* Identical graphs
* Identical confidence values
* Identical branching structure
No randomness is permitted at the protocol layer.
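One way to test this guarantee in practice is to hash a canonical serialization of the graph and compare digests across runs. The sketch below is an assumption, not part of the protocol: the function name is hypothetical, nodes are reduced to identifiers, and edge confidences are carried as decimal strings so that float formatting cannot differ between platforms.

```python
import hashlib
import json
from typing import Iterable, Tuple

# Hypothetical edge shape: (source, destination, confidence as a decimal string).
EdgeTuple = Tuple[str, str, str]


def graph_digest(nodes: Iterable[str], edges: Iterable[EdgeTuple]) -> str:
    """Hash a graph in canonical order so equal graphs give equal digests."""
    canonical = {
        "nodes": sorted(nodes),
        "edges": sorted(list(edge) for edge in edges),
    }
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


run_a = graph_digest(["tx1", "tx2", "tx3"], [("tx1", "tx2", "1.0"), ("tx2", "tx3", "0.5")])
run_b = graph_digest(["tx3", "tx1", "tx2"], [("tx2", "tx3", "0.5"), ("tx1", "tx2", "1.0")])
same = run_a == run_b
```

Sorting before serializing removes any dependence on discovery order, so two runs that build the same graph in a different sequence still produce the same digest.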
## 10. Out of Scope (Explicit)
FPP does NOT define:
* Data ingestion
* Indexing strategies
* Storage engines
* UI / visualization
* Identity mapping
* Legal conclusions
These are **implementation-layer concerns**.
## 11. Versioning
This document defines **FPP v0.1 (unfrozen)**.
Changes require:
* Explicit revision notes
* Invariant review
* Threat model reassessment