# PROTOCOL.md

## Forensic Provenance Protocol (FPP)

## 1. Purpose

The Forensic Provenance Protocol (FPP) defines a deterministic, graph-based model for reconstructing the **transactional provenance** of digital assets to arbitrary depth, both backward and forward in time.

The protocol is designed for **forensic analysis**, **risk assessment**, and **probabilistic attribution**, not for identity resolution or surveillance.

FPP answers questions of the form:

* *Where did these funds originate?*

* *How did value propagate through the network?*

* *What transformations or obfuscations occurred along the way?*

* *With what confidence can flows be linked across breaks?*

## 2. Core Abstractions

### 2.1 Ledger Event

A **Ledger Event** is any on-chain occurrence that can change the state, ownership, or representation of value.

Examples:

* Native transfers

* Token transfers

* Contract calls

* Mint / burn events

* Bridge lock / mint pairs

* Mixer entry / exit events

Ledger Events are **observed facts**, not interpreted actions.

### 2.2 Provenance Graph

FPP models all activity as a **directed acyclic graph (DAG)**:

* **Vertices (Nodes)** represent *events or states*

* **Edges** represent *value flow or transformation*

* **Direction** is always **forward in time**

Cycles are disallowed at the protocol level.

If a ledger appears cyclic, it is represented as a linearized sequence of state transitions.
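
As a minimal sketch of the forward-in-time rule, the snippet below rejects any edge whose destination is not strictly later than its source. The class name `ProvenanceGraph`, the integer timestamps, and the `address@time` node ids are illustrative assumptions, not part of the protocol. Requiring strictly increasing timestamps is just one simple way to rule out cycles; a real ledger would need a finer ordering key (for example block height plus transaction index) for events in the same block.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """Toy provenance graph: node id -> timestamp, plus forward-in-time edges."""
    timestamps: dict[str, int] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)

    def add_node(self, node_id: str, timestamp: int) -> None:
        self.timestamps[node_id] = timestamp

    def add_edge(self, src: str, dst: str) -> None:
        # Assumption: strict timestamp ordering is used as the acyclicity check.
        if self.timestamps[src] >= self.timestamps[dst]:
            raise ValueError(f"edge {src} -> {dst} does not point forward in time")
        self.edges.append((src, dst))


g = ProvenanceGraph()
# Value leaves address A and later returns to it: two distinct state nodes.
g.add_node("A@t1", 1)
g.add_node("B@t2", 2)
g.add_node("A@t3", 3)
g.add_edge("A@t1", "B@t2")
g.add_edge("B@t2", "A@t3")  # linearized: a new node for A, not an edge back to "A@t1"
```

Here the apparent cycle A → B → A becomes the linear sequence `A@t1 → B@t2 → A@t3`, and an attempt to add an edge from `A@t3` back to `A@t1` raises `ValueError`.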

## 3. Node Types

Every node in the provenance graph has a **type**, **timestamp**, and **ledger reference**.

### 3.1 Transaction Node

Represents a concrete on-chain transaction.

Properties:

* Transaction hash

* Block height / slot

* Timestamp

* Input addresses

* Output addresses

* Asset identifiers

* Amounts

### 3.2 State Node

Represents a derived or intermediate asset state.

Examples:

* Post-mix output pool

* Bridge custody state

* Contract escrow state

* Aggregated UTXO state

A state node does not necessarily map 1:1 to a transaction.

### 3.3 Synthetic Node

Represents a **logical grouping** introduced by the protocol for analysis.

Examples:

* Mixer anonymity set

* Exchange hot wallet cluster

* Batch settlement window

* Known protocol pool

Synthetic nodes are **explicitly marked** and never treated as on-chain facts.
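
One possible way to carry the three node types and the common properties (type, timestamp, ledger reference) is sketched below. The names `NodeType`, `Node`, and `is_synthetic`, and the formats chosen for the timestamp and ledger reference, are assumptions made for illustration; the protocol does not prescribe a data layout.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    TRANSACTION = "transaction"
    STATE = "state"
    SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: NodeType
    timestamp: int   # assumed here: Unix seconds
    ledger_ref: str  # assumed here: a chain name plus a tx hash or pool label

    @property
    def is_synthetic(self) -> bool:
        # Synthetic nodes are analysis constructs and must stay explicitly marked.
        return self.node_type is NodeType.SYNTHETIC


tx = Node("n1", NodeType.TRANSACTION, 1_700_000_000, "example-chain:0xabc")
pool = Node("n2", NodeType.SYNTHETIC, 1_700_000_600, "example-chain:mixer-set")
```

Keeping the type on the node itself means any downstream consumer can filter out synthetic nodes before presenting results as on-chain facts.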

## 4. Edge Semantics

Edges represent **value propagation**.

Each edge has:

* Source node

* Destination node

* Asset type

* Amount or fraction

* Confidence weight ∈ (0, 1]

### 4.1 Deterministic Edge

Used when value flow is provably exact.

Examples:

* Direct transfer

* Single-input / single-output transaction

* Verified bridge mint corresponding to a lock

Confidence = 1.0

### 4.2 Probabilistic Edge

Used when exact flow cannot be determined.

Examples:

* Mixers

* CoinJoin-style transactions

* Exchange internal rebalancing

* Obfuscated smart contracts

Confidence < 1.0
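
The edge attributes and the confidence range (0, 1] can be sketched as a small validated record. The `Edge` class, its field names, and the choice of integer base units for amounts are illustrative assumptions rather than protocol definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    asset: str
    amount: int        # assumed: smallest indivisible unit of the asset
    confidence: float  # must lie in (0, 1]

    def __post_init__(self) -> None:
        if not (0.0 < self.confidence <= 1.0):
            raise ValueError("confidence must lie in (0, 1]")

    @property
    def is_deterministic(self) -> bool:
        # Deterministic edges carry confidence exactly 1.0; all others are probabilistic.
        return self.confidence == 1.0


direct = Edge("tx1", "tx2", "ETH", 1_500, 1.0)               # direct transfer
mixed = Edge("mixer_in", "mixer_out", "ETH", 1_500, 0.02)    # mixer hop
```

Rejecting a confidence of 0 at construction time matches the open lower bound of the range: an edge with no support at all is simply not in the graph.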

## 5. Provenance Traversal

### 5.1 Backward Traversal (Ancestry)

Given a target node *N*, backward traversal reconstructs:

* All predecessor nodes

* All inbound edges

* Up to a user-defined depth *d*

Within that depth limit, traversal **halts early only** when:

* Genesis is reached

* Asset mint is reached

* Confidence drops below threshold ε
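
The stopping rules above can be sketched as a breadth-first walk over inbound edges. The `INBOUND` index, the function name `backward_traverse`, and the tuple layout of the result are hypothetical; the sketch also assumes that the threshold ε is applied to the running path confidence, which the text does not state explicitly.

```python
from collections import deque

# Hypothetical inbound-edge index: node -> list of (predecessor, edge confidence).
INBOUND = {
    "N": [("A", 1.0), ("M", 0.5)],
    "A": [("G", 1.0)],
    "M": [("X", 0.1)],
    "G": [],  # genesis or asset mint: no predecessors
    "X": [],
}


def backward_traverse(inbound, target, max_depth, epsilon):
    """Return (predecessor, successor, path_confidence, depth) for every edge kept."""
    kept = []
    queue = deque([(target, 1.0, 0)])
    while queue:
        node, conf, depth = queue.popleft()
        if depth == max_depth:
            continue  # user-defined depth d reached
        # Sorting keeps the output order independent of input order.
        for pred, edge_conf in sorted(inbound.get(node, [])):
            path_conf = conf * edge_conf
            if path_conf < epsilon:
                continue  # confidence dropped below the threshold
            kept.append((pred, node, path_conf, depth + 1))
            queue.append((pred, path_conf, depth + 1))
    return kept


ancestry = backward_traverse(INBOUND, "N", max_depth=3, epsilon=0.1)
```

In this toy graph the branch through `M` stops at `X` because the path confidence 0.5 × 0.1 falls below ε = 0.1, while the branch through `A` ends naturally at `G`, which has no predecessors.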

### 5.2 Forward Traversal (Descendants)

Given a source node *N*, forward traversal reconstructs:

* All successor nodes

* All outbound edges

* Potential future dispersal paths

Used for:

* Risk propagation

* Exposure analysis

* Contamination modeling
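
Forward traversal can be illustrated by enumerating every dispersal path from a source node up to a depth limit. The `OUTBOUND` index and the function `dispersal_paths` are hypothetical names for this sketch, and confidence handling is left out to keep the example focused on path enumeration.

```python
# Hypothetical outbound-edge index: node -> list of successors.
OUTBOUND = {
    "S": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": [],
}


def dispersal_paths(outbound, source, max_depth):
    """Enumerate every forward path from `source` with at most `max_depth` edges."""
    paths = []

    def walk(path):
        successors = sorted(outbound.get(path[-1], []))
        # Stop at a node with no successors or when the depth limit is reached.
        if not successors or len(path) - 1 == max_depth:
            if len(path) > 1:
                paths.append(tuple(path))
            return
        for nxt in successors:
            walk(path + [nxt])

    walk([source])
    return paths


exposure = dispersal_paths(OUTBOUND, "S", max_depth=2)
```

Both routes into `C` are reported separately, so a later exposure or contamination model can weigh each branch on its own terms.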

## 6. Confidence Propagation

Confidence is multiplicative along paths.

For a path P with edges e₁ … eₙ:


confidence(P) = confidence(e₁) × confidence(e₂) × … × confidence(eₙ)

Multiple paths between the same nodes are **not collapsed** automatically.

Aggregation strategies are explicitly defined outside the core protocol.
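
A short worked example of multiplicative propagation, using made-up edge confidences: two distinct paths connect the same pair of nodes, and each keeps its own confidence rather than being merged. The function name `path_confidence` and the sample data are assumptions for illustration.

```python
import math


def path_confidence(edge_confidences):
    """Multiply edge confidences along a single path."""
    return math.prod(edge_confidences)


# Two hypothetical paths from N to M, each listed with its edge confidences.
paths_n_to_m = {
    ("N", "A", "M"): [1.0, 0.5],
    ("N", "B", "M"): [0.8, 0.25],
}

# One confidence per path; no aggregation across paths is performed here.
per_path = {path: path_confidence(confs) for path, confs in paths_n_to_m.items()}
```

The first path has confidence 1.0 × 0.5 = 0.5 and the second 0.8 × 0.25 = 0.2. How (or whether) to combine these into a single N-to-M figure is an aggregation decision left to the layer above the protocol.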

## 7. Depth Control

Traversal depth is **explicit and user-controlled**.

* Depth = number of edges

* No implicit limits

* Supports any depth from 1 to N, where N has no protocol-imposed upper bound

This enables:

* Shallow inspection

* Deep forensic reconstruction

* Supercomputer-scale traversal

## 8. Obfuscation Awareness (Non-breaking)

Obfuscation mechanisms are **modeled**, not defeated.

The protocol does not claim:

* De-anonymization

* Identity recovery

* Guaranteed attribution

Instead, it provides:

* Explicit uncertainty

* Confidence decay

* Branch explosion where appropriate

## 9. Determinism Guarantee

Given:

* Same ledger data

* Same protocol version

* Same traversal parameters

FPP **must** produce:

* Identical graphs

* Identical confidence values

* Identical branching structure

No randomness is permitted at the protocol layer.
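
One way an implementation might check this guarantee is to compare a digest of a canonical serialization across runs. This is a sketch under stated assumptions: the function `graph_digest`, the tuple layout of edges, and the choice to encode confidence values as strings (to sidestep platform-dependent float formatting) are not defined by the protocol.

```python
import hashlib
import json


def graph_digest(nodes, edges):
    """Hash a graph in a canonical form that ignores input ordering."""
    canonical = {
        "nodes": sorted(nodes),
        "edges": sorted(edges),  # each edge: (src, dst, confidence as string)
    }
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The same graph, produced in a different order by two hypothetical runs.
run_a = graph_digest(["n2", "n1"], [("n1", "n2", "1.0")])
run_b = graph_digest(["n1", "n2"], [("n1", "n2", "1.0")])
```

Equal digests indicate that both runs produced the same nodes, edges, and confidence values; any difference in branching structure changes the edge set and therefore the digest.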

## 10. Out of Scope (Explicit)

FPP does NOT define:

* Data ingestion

* Indexing strategies

* Storage engines

* UI / visualization

* Identity mapping

* Legal conclusions

These are **implementation-layer concerns**.

## 11. Versioning

This document defines **FPP v0.1 (unfrozen)**.

Changes require:

* Explicit revision notes

* Invariant review

* Threat model reassessment
