ProofGatedAgents

Deterministic, governed multi-agent systems — by design, not by hope.

Most “autonomous agent” frameworks trust non-deterministic LLM output and fix problems after failure.

ProofGatedAgents does the opposite.

This framework treats LLMs as untrusted proposal generators and enforces correctness through deterministic verification, explicit policy, and machine-verifiable proofs.

If an agent cannot prove its work, the system stops.

No silent drift.
No hidden writes.
No runaway autonomy.

Why this exists

The current agent ecosystem optimizes for demos, not systems.

Most frameworks implicitly assume:

LLM output is “good enough”
retries will eventually converge
autonomy is a feature, not a risk

In production environments, these assumptions fail.

The hard problems nobody enforces

How do you prove an agent respected policy?
How do you detect silent drift between runs?
How do you prevent “helpful” self-modification?
How do you stop infinite retry loops and escalate failure?

ProofGatedAgents exists to answer these questions explicitly.

It introduces:

machine-verifiable proofs as first-class artifacts
deterministic envelopes around non-deterministic models
locked governance contracts that agents cannot rewrite
explicit stop rules and escalation paths
bounded, proposal-only self-improvement

This is not an agent toy.
It is an execution framework for systems that must be auditable, reproducible, and governable.

What this is

A governed autonomous multi-agent framework with:

Stage-based execution
Proof-gated verification (machine-readable JSON proofs plus human-readable reports)
Policy-bounded write scope with drift guards
Bounded repair loops with explicit stop conditions
Controlled post-run self-improvement (“Sophia-lite”) that proposes small framework improvements between runs

LLM outputs are treated as non-deterministic suggestions and validated through deterministic mechanisms:

tests and checks
fixed evaluation slices and stable seeds (when applicable)
hashes of critical artifacts
locked governance files
explicit escalation rules
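
To make the "hashes of critical artifacts" mechanism concrete, here is a minimal sketch using only the standard library. It is not the contents of scripts/hash_drift_guard.py; the function names and the manifest layout (a flat path-to-digest mapping) are assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths: list[Path]) -> dict[str, str]:
    """Map each path (as a POSIX-style string) to its digest.

    The flat {path: digest} layout is a hypothetical format for this sketch.
    """
    return {p.as_posix(): sha256_of_file(p) for p in sorted(paths)}


def find_drift(recorded: dict[str, str], current: dict[str, str]) -> list[str]:
    """List paths whose digest changed, appeared, or disappeared."""
    all_paths = sorted(set(recorded) | set(current))
    return [p for p in all_paths if recorded.get(p) != current.get(p)]
```

A run would record a manifest once, rebuild it before each stage, and treat any non-empty result from find_drift as grounds for a stop.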

What makes this different

| Typical agent frameworks | ProofGatedAgents |
| --- | --- |
| Trust LLM output | Treat output as untrusted |
| Retry until it works | Hard stop on failed proof |
| Mutable runtime state | Locked contracts + drift guard |
| Implicit behavior | Explicit policy + escalation |
| Self-modifying agents | Proposal-only self-improvement |

Repository structure (source of truth)

Core contracts (locked during normal operation)

AGENT.md
ORCHESTRATOR.md
Orchestrator_Dispatch_Contract.md
ARCHITECTAGENT.md
BUILDERAGENT.md
VERIFIERAGENT.md
REPAIRAGENT.md
policy/policy.json
schemas/*.schema.json

Governance and operating documentation

docs/SYSTEM_ARCHITECTURE.md
docs/DETERMINISM_ENVELOPE.md
docs/LOOP_POLICY.md
docs/SCORING_RUBRIC.md
docs/self_improvement/IMPROVEMENT_POLICY.md
docs/self_improvement/POSTRUN_SOPHIA.md

Project templates

templates/PRODUCT_SPEC.template.md
templates/stage_spec.template.md
templates/handoff.template.json
templates/postrun_report.template.json
templates/loop_state.template.json

Deterministic helper scripts (standard library only)

scripts/hash_drift_guard.py
scripts/deterministic_env_snapshot.py
scripts/postrun_analyze.py

Deliverables (end-of-run artifacts)

deliverables/*.md

Quickstart (prompt-native)

1. Create a project specification

Copy:

templates/PRODUCT_SPEC.template.md → PRODUCT_SPEC.md

Fill in scope, constraints, success criteria, and non-goals.

2. Define your stage chain

For each stage:

stages/<STAGE_NAME>/SPEC.md

Use:

templates/stage_spec.template.md

3. Initialize runtime state

Copy:

templates/handoff.template.json → .handoff.json

Populate:

run_id
governance hashes
next_allowed_stage
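
The three fields above could be populated with a short script like the following sketch. The keys run_id and next_allowed_stage come from this README; the key name governance_hashes, the path-to-digest layout, and the init_handoff helper are assumptions, so check them against templates/handoff.template.json before relying on them.

```python
import hashlib
import json
from pathlib import Path


def init_handoff(
    template: dict,
    run_id: str,
    governance_files: list[Path],
    first_stage: str,
    out_path: Path,
) -> dict:
    """Fill in a handoff document and write it to out_path as sorted JSON.

    `governance_hashes` is a hypothetical key name; the real template may
    use a different field or structure.
    """
    handoff = dict(template)  # shallow copy so the template stays untouched
    handoff["run_id"] = run_id
    handoff["governance_hashes"] = {
        p.as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(governance_files)
    }
    handoff["next_allowed_stage"] = first_stage
    # sort_keys keeps the serialized file byte-stable for identical input.
    out_path.write_text(
        json.dumps(handoff, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
    return handoff
```

Passing run_id in explicitly, rather than generating it inside the helper, keeps the output reproducible for a given set of inputs.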

4. Execute the pipeline

The Orchestrator:

reads locked contracts and policy
follows the dispatch table
enforces write scopes and stop rules

For each stage:

BuilderAgent produces artifacts
VerifierAgent emits:
- outputs/proofs/<stage>_proof.json
- evidence under outputs/diagnostics/<stage>/
RepairAgent attempts minimal, policy-compliant fixes only if verification fails

If the proof still fails after the bounded number of repair attempts, the run ends in a HARD STOP.
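
The build, verify, repair cycle with its bounded attempts can be sketched as follows. This is an illustration of the control flow only, not the Orchestrator's actual implementation; HardStop, Proof, run_stage, and the max_repairs parameter are hypothetical names, and the authoritative limits live in docs/LOOP_POLICY.md.

```python
from dataclasses import dataclass
from typing import Callable


class HardStop(RuntimeError):
    """Raised when a stage cannot produce a passing proof within its budget."""


@dataclass(frozen=True)
class Proof:
    stage: str
    passed: bool
    verifications: int  # how many times the verifier ran


def run_stage(
    stage: str,
    build: Callable[[], None],
    verify: Callable[[], bool],
    repair: Callable[[], None],
    max_repairs: int,
) -> Proof:
    """Build once, then verify with at most `max_repairs` repair attempts."""
    build()
    for attempt in range(max_repairs + 1):
        if verify():
            return Proof(stage, True, attempt + 1)
        if attempt < max_repairs:
            repair()  # repair only runs after a failed verification
    # Budget exhausted: escalate instead of retrying indefinitely.
    raise HardStop(f"{stage}: proof failed after {max_repairs} repair attempts")
```

The loop has a fixed upper bound, so a stage either returns a passing proof or raises; there is no path that retries forever.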

5. Post-run analysis (Sophia-lite)

Run:

python scripts/postrun_analyze.py --proofs-dir outputs/proofs --out-dir outputs/sophia

This generates:

outputs/sophia/postrun_report.json
outputs/sophia/proposals/ARIP-*.json

6. Apply improvements (bounded)

Only proposals that comply with docs/self_improvement/IMPROVEMENT_POLICY.md may be applied
Changes to locked governance files can only be proposed, never applied automatically
Any structural change requires explicit human approval
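
One way to picture these three rules is as a small gate in front of every proposal. The sketch below is hypothetical throughout: the proposal fields targets and structural, the classify_proposal function, and the returned labels are invented for illustration and are not the ARIP schema. The locked file list is taken from the "Core contracts" section of this README, and the policy_compliant flag stands in for whatever check IMPROVEMENT_POLICY.md actually defines.

```python
from fnmatch import fnmatchcase

# Locked files as listed under "Core contracts" in this README.
LOCKED_PATTERNS = (
    "AGENT.md",
    "ORCHESTRATOR.md",
    "Orchestrator_Dispatch_Contract.md",
    "ARCHITECTAGENT.md",
    "BUILDERAGENT.md",
    "VERIFIERAGENT.md",
    "REPAIRAGENT.md",
    "policy/policy.json",
    "schemas/*.schema.json",
)


def classify_proposal(
    proposal: dict, policy_compliant: bool, human_approved: bool = False
) -> str:
    """Return one of: rejected, proposal_only, needs_human_approval, may_apply.

    `targets` (list of repo-relative paths) and `structural` (bool) are
    assumed proposal fields, not a documented format.
    """
    if not policy_compliant:
        return "rejected"
    targets = proposal.get("targets", [])
    if any(fnmatchcase(t, pat) for t in targets for pat in LOCKED_PATTERNS):
        return "proposal_only"  # locked files are never changed by the gate
    if proposal.get("structural", False) and not human_approved:
        return "needs_human_approval"
    return "may_apply"
```

Note that human approval does not unlock governance files in this sketch; a proposal touching them stays a proposal regardless of the other flags.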

Determinism and evidence

Authoritative references:

docs/DETERMINISM_ENVELOPE.md
docs/LOOP_POLICY.md

Generate drift guard hashes:

python scripts/hash_drift_guard.py --policy policy/policy.json --output outputs/env/drift_guard_manifest.json

Generate environment snapshot:

python scripts/deterministic_env_snapshot.py --output outputs/env/environment_snapshot.json

Security model

Default-deny write paths
Explicitly locked governance files
Network disabled by default (policy.offline_by_default = true)
Any deviation from policy or drift rules triggers an immediate HARD STOP
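
As a rough illustration of default-deny write paths, the check below allows a write only when the target sits under an explicitly allowed root. The function name is_write_allowed and the idea of passing roots as a list are assumptions for this sketch; the real scope definition lives in policy/policy.json, whose structure is not shown in this README.

```python
from pathlib import PurePosixPath


def is_write_allowed(target: str, allowed_roots: list[str]) -> bool:
    """Default-deny: permit a repo-relative path only under an allowed root."""
    path = PurePosixPath(target)
    # Absolute paths and parent traversal are refused outright.
    if path.is_absolute() or ".." in path.parts:
        return False
    for root in allowed_roots:
        root_path = PurePosixPath(root)
        if path == root_path or root_path in path.parents:
            return True
    return False  # nothing matched, so the write is denied
```

For example, with allowed_roots set to ["outputs/proofs", "outputs/diagnostics"], a write to outputs/proofs/build_proof.json passes while a write to policy/policy.json is refused. Comparing path components rather than string prefixes means a sibling directory such as outputs/proofs_old does not slip through.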

Design philosophy

Autonomy is bounded
Verification is mandatory
Governance is explicit
Improvement is proposal-based, not self-executing

LLMs suggest.
The system decides.

License

Project-specific. See LICENSE for details.
