mcp-name: io.github.Ratnaditya-J/spineframe
Audit-grade provenance for AI agent workflows.
SpineFrame is an execution and provenance runtime for AI pipelines in regulated environments. Every tool call, every piece of evidence, and every LLM decision is traced, policy-checked, and signed. The output is not just a result; it is a verifiable evidence chain that survives audit.
Website: spineframe.xyz | Spec: SPEC.md | Changelog: CHANGELOG.md | PyPI: spineframe
AI agents are making consequential decisions — gathering compliance evidence, analyzing infrastructure, synthesizing research for regulated industries. The outputs matter. And right now, there is no way to answer basic audit questions:
- What tool calls produced this evidence? Which MCP server, which parameters, when?
- Was every tool invocation policy-checked? Can you prove it?
- Can you trace a finding back to the source document? Through normalization, through mapping, to the original retrieval?
- Who approved the export? Is the approval signed and tamper-evident?
SpineFrame makes these questions answerable by default, not as a reporting afterthought.
pip install spineframe[signing]
python examples/compliance_demo.py
This runs a full SOC2 audit evidence workflow against sample data:
- Parse the compliance target (framework, scope, controls)
- Gather evidence from local files or MCP tool sources — every retrieval logged with tool call ID, params hash, and policy decision
- Normalize into a standard schema — each item gets a new evidence_id with upstream_evidence_id preserving the lineage
- Map evidence to SOC2/ISO 27001 controls
- Analyze gaps — missing controls, weak evidence, coverage metrics
- Export a signed compliance package with full provenance chain
The output is two artifacts:
- compliance_package.json — control mappings, findings, coverage
- evidence.json — clean evidence index with tool-call lineage, content hashes, and explicit inline/reference markers
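A reviewer can spot-check any inlined item by recomputing its content hash. A minimal sketch, assuming UTF-8 content and the sha256:&lt;hex&gt; format shown in the evidence.json sample further down (the sample item below is hypothetical):

```python
import hashlib

def check_content_hash(item: dict) -> bool:
    """Recompute SHA-256 over inlined content and compare it to the
    item's recorded content_hash. Referenced (non-inlined) content
    lives in an artifact and is skipped here."""
    if not item.get("content_inlined"):
        return True
    digest = hashlib.sha256(item["content"].encode("utf-8")).hexdigest()
    return item["content_hash"] == f"sha256:{digest}"

# Hypothetical item mirroring the export shape
sample = {
    "content_inlined": True,
    "content": "# Access Control Policy\n...",
    "content_hash": "sha256:"
        + hashlib.sha256("# Access Control Policy\n...".encode()).hexdigest(),
}
```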
For the full approval, diff, and signing flow, run:
python examples/compliance_e2e_scenario.py --verbose
Then fork the same evidence base for a different framework:
spineframe fork <run-id> --from-stage target_parse \
--query "ISO 27001 audit for Acme Corp"
Same evidence, different control mapping. Seconds instead of hours.
These are not optional features. They are on by default.
Tool Call Provenance — Every tool invocation records: tool call ID (tc_ prefixed ULID), fully qualified tool name, parameter hash, request/completion timestamps, raw result hash, and the policy decision that authorized it. The schema is identical whether the call came from an LLM tool-use loop or a deterministic pipeline.
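As a mental model, the record carries roughly these fields. A sketch only: field names are inferred from the description above, and the sample values are invented; the real schema lives in spineframe.models.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCallRecord:
    """Illustrative shape of a tool-call provenance record."""
    tool_call_id: str     # tc_-prefixed ULID
    tool_fqn: str         # e.g. "github:get_file_contents"
    params_hash: str      # hash over canonicalized parameters
    requested_at: str     # request timestamp
    completed_at: str     # completion timestamp
    result_hash: str      # hash of the raw result
    policy_decision: str  # decision that authorized the call

# Hypothetical record
rec = ToolCallRecord(
    tool_call_id="tc_01JXYZEXAMPLE",
    tool_fqn="github:get_file_contents",
    params_hash="sha256:5f912c",
    requested_at="2026-03-15T14:30:22Z",
    completed_at="2026-03-15T14:30:23Z",
    result_hash="sha256:9a01bd",
    policy_decision="allow",
)
```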
Evidence Lineage — Evidence items carry prefixed ULIDs (ev_) through every stage. When normalization re-mints an ID, the original is preserved as upstream_evidence_id. When export omits large content, the entry carries content_inlined: false and content_ref: "artifact". The chain is: tool_call_id → raw evidence → upstream_evidence_id → normalized evidence → export.
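Given an index keyed by evidence_id, walking that chain is a short loop. A sketch, assuming the record shape of the evidence.json export (the two-record index is hypothetical):

```python
def lineage_chain(evidence_id: str, index: dict) -> list:
    """Follow upstream_evidence_id links from a normalized item back to
    the raw retrieval, then anchor the chain to its tool_call_id."""
    chain = []
    current = evidence_id
    while current is not None:
        chain.append(current)
        current = index[current].get("provenance", {}).get("upstream_evidence_id")
    # The oldest record's tool_call_id ties the chain to a tool call.
    chain.append(index[chain[-1]]["provenance"]["tool_call_id"])
    return chain

# Hypothetical two-record index: raw retrieval plus normalized re-mint
index = {
    "ev_raw": {"provenance": {"tool_call_id": "tc_001"}},
    "ev_norm": {"provenance": {"tool_call_id": "tc_001",
                               "upstream_evidence_id": "ev_raw"}},
}
```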
Tool Policy Enforcement — Declarative per-stage tool allowlists evaluated before every invocation. Policy checks run in both the LLM tool-use loop and direct MCP paths. Denied calls raise immediately and are logged.
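In spirit, the check is a gate evaluated before every call. A minimal sketch under assumed names; the real engine in tool_policy.py is declarative and richer:

```python
class ToolPolicyDenied(Exception):
    """Raised when a tool call is not on the stage's allowlist."""

def enforce_tool_policy(stage_id: str, tool_fqn: str, allowlists: dict) -> str:
    """Return the decision attached to the tool-call record, raising
    on deny so the pipeline fails fast rather than proceeding."""
    if tool_fqn in allowlists.get(stage_id, ()):
        return "allow"
    raise ToolPolicyDenied(f"{tool_fqn} denied in stage {stage_id}")
```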
Signed Provenance — Ed25519 signatures on provenance bundles covering the full merkle hash tree. Generate keys with spineframe keygen, sign at export, verify externally or with spineframe verify --signature.
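Signing a merkle root rather than raw bytes means any artifact change flips the signed value. A stdlib sketch of folding leaf hashes into a root; SpineFrame's actual tree layout and canonicalization are not shown here:

```python
import hashlib

def merkle_root(leaf_hashes: list) -> str:
    """Pairwise-hash hex leaves level by level, duplicating the last
    leaf on odd-sized levels, until one root remains."""
    level = list(leaf_hashes)
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256((a + b).encode()).hexdigest()
            for a, b in zip(level[::2], level[1::2])
        ]
    return level[0]
```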
Artifact Isolation — Stages declare their inputs and outputs. Undeclared reads or writes fail at runtime. This is enforced by default; disable with --unsafe for development only.
Tamper-Evident Audit Log — Every action (auth, access, tool calls, approvals, exports) is recorded in a hash-chained JSONL log. Each entry includes the SHA-256 of the previous entry. Tampering or deletion is detectable.
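An external auditor can verify such a chain without SpineFrame installed. A sketch, assuming each entry records the previous entry's hash under a prev_hash field (field name and serialization are illustrative):

```python
import hashlib, json

def verify_chain(entries: list) -> bool:
    """Re-hash each entry and confirm the next entry's prev_hash
    matches; any edit or mid-chain deletion breaks the link."""
    prev = None
    for entry in entries:
        if prev is not None and entry.get("prev_hash") != prev:
            return False
        prev = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
    return True

# Hypothetical two-entry chain
e1 = {"action": "tool_call", "prev_hash": None}
h1 = hashlib.sha256(json.dumps(e1, sort_keys=True).encode()).hexdigest()
e2 = {"action": "export", "prev_hash": h1}
```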
Domain Assertions — Machine-checkable rules at stage boundaries: evidence completeness thresholds, minimum source counts, blocked domains, confidence floors, export approval gates.
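To give a flavor of what a stage-boundary assertion evaluates, here is a sketch with three hypothetical rules (minimum source count, confidence floor, blocked domains); SpineFrame's assertion engine lives in domain_policy.py and differs in detail:

```python
def failed_assertions(stage_output: dict, rules: dict) -> list:
    """Return names of the rules a stage output violates."""
    failures = []
    sources = stage_output.get("sources", [])
    if len(sources) < rules.get("min_sources", 0):
        failures.append("min_sources")
    if stage_output.get("confidence", 0.0) < rules.get("confidence_floor", 0.0):
        failures.append("confidence_floor")
    blocked = set(rules.get("blocked_domains", []))
    if blocked & {s.get("domain") for s in sources}:
        failures.append("blocked_domains")
    return failures
```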
Claims vs. Reality Verification — A dedicated verify_claims stage extracts factual claims from agent output, matches each against the signed provenance record, and classifies them as verified, unverified, contradicted, or fabricated. The verification report is itself a signed artifact — meta-provenance. Add it to any pipeline to catch hallucinated tool calls, inflated result counts, or claims that don't trace to any source.
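Conceptually, each extracted claim is matched against the tool-call record it cites. A simplified sketch of the four-way classification; the matching logic and field names here are illustrative, not the verification engine's:

```python
def classify_claim(claim: dict, tool_calls: dict) -> str:
    """Classify one claim against recorded tool calls.
    tool_calls maps tool_call_id -> observed result summary."""
    record = tool_calls.get(claim.get("tool_call_id"))
    if record is None:
        return "fabricated"   # cites a call that never happened
    observed = record.get(claim.get("field", ""))
    if observed is None:
        return "unverified"   # nothing recorded to check against
    return "verified" if observed == claim.get("value") else "contradicted"
```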
The evidence.json export is designed for reviewers and downstream compliance systems:
{
"schema_version": "1",
"run_id": "20260315_143022_abc123",
"project": "acme-soc2",
"generated_at": "2026-03-15T14:32:18Z",
"evidence": [
{
"evidence_id": "ev_01JXYZ...",
"evidence_type": "policy_document",
"title": "Access Control Policy",
"content_hash": "sha256:5f912c...",
"content_inlined": true,
"source": {
"system": "github",
"object_id": "github://acme/policies/main/soc2/access-control.md"
},
"provenance": {
"tool_call_id": "tc_01JXYZ...",
"tool_fqn": "github:get_file_contents",
"upstream_evidence_id": "ev_01JXYW..."
},
"content": "# Access Control Policy\n..."
}
]
}

SpineFrame ships with project-scoped API key authentication and role-based access control.
Roles: admin > operator > auditor > viewer. Every non-health endpoint requires authentication when API keys are configured.
Project scoping: API keys can be scoped to a project. Scoped keys only see runs, artifacts, audit entries, and WebSocket events matching their project.
Audit isolation: Scoped auditors see only their project's audit log entries. WebSocket broadcasts are filtered by project — no cross-project event leakage.
pip install spineframe
Optional dependencies:
pip install spineframe[signing] # Ed25519 provenance signing
pip install spineframe[mcp] # MCP tool providers
pip install spineframe[anthropic] # Claude models
pip install spineframe[openai] # OpenAI models
pip install spineframe[web] # Web UI (FastAPI + React)
pip install spineframe[all] # Everything
SpineFrame is a DAG execution engine. The assurance features above are built on general-purpose execution primitives that apply to any pipeline.
Checkpoint and Resume — Every stage writes immutable checkpoints. Resume compares SHA-256 input hashes to skip unchanged stages. A 50-stage pipeline that fails at stage 47 resumes from 47.
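The skip decision reduces to comparing an input hash against the checkpointed one. A sketch under assumed names; the engine's canonicalization and checkpoint format are not shown:

```python
import hashlib, json

def input_hash(inputs: dict) -> str:
    """Canonical SHA-256 over a stage's declared inputs."""
    return hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode()
    ).hexdigest()

def should_skip(stage_id: str, inputs: dict, checkpoints: dict) -> bool:
    """On resume, skip a stage whose inputs are byte-identical to the
    checkpointed run; any upstream change forces re-execution."""
    return checkpoints.get(stage_id) == input_hash(inputs)
```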
Fork with Instructions — Clone a completed run and diverge from any stage. Upstream artifacts are reused. One evidence base, multiple analyses:
spineframe fork <run-id> --from-stage synthesize_report \
--instructions "Focus on manufacturer liability"
Replay — Re-execute from any stage downstream while preserving upstream artifacts.
Dynamic Replanning — Stages can inject additional work into the DAG at runtime based on coverage signals.
Expandable Sub-stages — Stages that process large datasets split into plan/batch/collect phases with configurable concurrency.
SpineFrame ships three production pipelines. The compliance pipeline is the primary product surface; research and OSINT are reference implementations that demonstrate the engine's generality.
Compliance (6 stages) — Target parsing, evidence gathering (local files or MCP tool sources), normalization, control mapping (SOC2, ISO 27001), gap analysis, signed export packaging. Every tool call is governed and every evidence item is traceable.
Research (11 stages) — Web research with full provenance: planning, iterative search with coverage-driven refinement, page fetch, extraction, source verification, evidence mapping, synthesis, and report generation. Every claim traces to a source URL.
OSINT (10 stages) — Domain reconnaissance using 13 stdlib-only source clients. DNS, RDAP, certificate transparency, subdomain enumeration, infrastructure analysis, web archive, technology fingerprinting, identity recon, correlation, and reporting. Zero API keys required.
spineframe web --config providers.json
Run management, live WebSocket progress, pipeline selector, plan review, artifact browser with cost/timing data, run comparison with semantic diff, provenance chain viewer, and smart fork suggestions (LLM-recommended fork points from natural language).
spineframe run --query "SOC2 Type II audit for Acme Corp" \
--pipeline compliance --config providers.json
spineframe show <run-id>
spineframe inspect <run-id>
spineframe verify <run-id>
spineframe verify <run-id> --signature
spineframe export --run-id <run-id> --format html
spineframe fork <run-id> --from-stage <stage> --instructions "..."
spineframe resume <run-id>
spineframe replay <run-id> --from-stage <stage>
spineframe diff <run-id-a> <run-id-b>
spineframe verify-claims <run-id>
spineframe keygen
spineframe web
SpineFrame is also an MCP server. Any MCP client (Claude Desktop, Cursor, custom agents) can invoke SpineFrame as a tool — run audits, check status, retrieve reports, and verify provenance without touching the CLI or web UI.
Published in the MCP ecosystem as io.github.Ratnaditya-J/spineframe.
pip install spineframe[mcp]
spineframe-mcp --runs-dir ./runs --config providers.json
Claude Desktop config (claude_desktop_config.json):
{
"mcpServers": {
"spineframe": {
"command": "spineframe-mcp",
"args": ["--runs-dir", "./runs", "--config", "providers.json"]
}
}
}

Available tools:
| Tool | Description |
|---|---|
| run_compliance_audit | SOC 2 / ISO 27001 evidence audit with full provenance |
| run_osint_recon | Domain reconnaissance (13 sources, no API keys) |
| run_research | Web research with source tracing |
| get_run_status | Check run progress and stage completion |
| list_runs | List recent runs, filter by pipeline |
| get_report | Retrieve final report as markdown |
| get_artifacts | List or retrieve run artifacts |
| get_provenance | Full provenance chain (tool calls → evidence → controls) |
| fork_run | Fork from any stage and re-run downstream |
| verify_run | Verify artifact hashes, event log, and signatures |
SpineFrame both consumes MCP servers (GitHub, filesystem, etc. as evidence sources) and is one. Your agents can orchestrate SpineFrame as a tool with the same provenance guarantees.
from spineframe.engine import Engine, register_stage, Stage
from spineframe.models import StageNode, GraphDefinition, StageResult, StageStatus
@register_stage("my_gather")
class MyGatherStage(Stage):
def execute(self, context):
# context.call_tool() is policy-checked and provenance-logged
result = context.call_tool("github", "get_file_contents", {
"repo": "acme/policies", "path": "access-control.md"
})
context.write_artifact("evidence.json", {
"content": result.content,
"tool_call_id": "...",
})
return StageResult(
status=StageStatus.SUCCESS,
output_artifacts={"evidence.json": "artifacts/evidence.json"},
)
graph = GraphDefinition(stages=[
StageNode(stage_id="gather", stage_type="my_gather",
output_artifacts=["evidence.json"]),
StageNode(stage_id="analyze", stage_type="my_analyze",
depends_on=["gather"],
input_artifacts=["evidence.json"],
output_artifacts=["analysis.json"]),
])

src/spineframe/
engine.py DAG executor: run, resume, replay, fork, assurance hooks
models.py StageNode, GraphDefinition, RunManifest
ids.py Prefixed ULIDs (ev_, tc_) for evidence and tool calls
artifacts.py Content-addressed storage with SHA-256 hashes
checkpoints.py Immutable run snapshots
events.py Append-only JSONL event stream
signing.py Ed25519 signing and verification
tool_policy.py Declarative tool policy engine
domain_policy.py Domain assertion engine
mcp_server.py MCP server (stdio + SSE) — 10 tools
verification/ Claims vs. Reality verification engine
stages/
compliance/ 6-stage compliance pipeline
research.py 11-stage research pipeline
osint/ 10-stage OSINT pipeline
tools.py Tool invocation stages (direct + LLM tool-use loop)
providers/
factory.py Provider registry with lazy imports
mcp_tool.py MCP tool provider (stdio + SSE)
anthropic.py, openai.py, tavily_search.py, serper.py
pipelines/ DAG definitions
web/ FastAPI backend + React frontend
62 registered stage types. 930 tests.
Apache 2.0. See LICENSE.