grail: Verifiable Post-Training for LLMs

   ✦  ✦  ✦  ✦  ✦  ✦  ✦
  ┌─┐┬─┐┌─┐┬┬  
  │ ┬├┬┘├─┤││  
  └─┘┴└─┴ ┴┴┴─┘
  ✦  ✦  ✦  ✦  ✦  ✦  ✦

Documentation: Miner • Validator

Overview

grail delivers post-training for language models with cryptographically verifiable inference. It implements the GRAIL protocol (Guaranteed Rollout Authenticity via Inference Ledger) so that rollouts produced during RL are tied to a specific model and input, and can be independently verified by validators.

Key Distinction: grail vs GRAIL

grail (lowercase): The Bittensor subnet implementation orchestrating miners and validators for verifiable post-training
GRAIL (uppercase): The protocol that proves rollout authenticity and model identity

Current Status

The current release is inference-only: miners generate rollouts and validators verify and score them.
Reinforcement learning post-training (e.g., GRPO trainer and model updates) will be added in a future version.

Architecture

Core Components

1. GRAIL Protocol (`grail/grail.py`)

Prover/Verifier implementation with:

PRF-based index derivation and sketch commitments for token-level verification
Verifier-supplied challenge (drand + chain/window context)
Token and model-config validation; structured signatures bound to model identity
SAT problem binding and solution checks for end-to-end rollout verification
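To make the first bullet concrete, here is a minimal sketch of PRF-based index derivation and a modular sketch commitment. It is an illustration under stated assumptions, not the code in `grail/grail.py`: the SHA-256 counter-mode PRF, the function names `prf`, `derive_indices`, and `sketch`, and the labels `b"index"` and `b"coeff"` are all hypothetical. Only `PRIME_Q` and `CHALLENGE_K` come from this README.

```python
import hashlib

PRIME_Q = 2_147_483_647  # modulus listed under Protocol & Config
CHALLENGE_K = 16         # minimum number of challenged positions


def prf(seed: bytes, label: bytes, counter: int) -> int:
    """Hypothetical PRF: SHA-256 over seed, label, and counter, read as an integer."""
    digest = hashlib.sha256(seed + label + counter.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def derive_indices(seed: bytes, seq_len: int, k: int = CHALLENGE_K) -> list[int]:
    """Pick k distinct token positions deterministically from the challenge seed."""
    k = min(k, seq_len)
    chosen: set[int] = set()
    counter = 0
    while len(chosen) < k:
        chosen.add(prf(seed, b"index", counter) % seq_len)
        counter += 1
    return sorted(chosen)


def sketch(values: list[int], seed: bytes) -> int:
    """Compress a vector into one value: a PRF-weighted sum modulo PRIME_Q."""
    total = 0
    for i, v in enumerate(values):
        coeff = prf(seed, b"coeff", i) % PRIME_Q
        total = (total + coeff * (v % PRIME_Q)) % PRIME_Q
    return total


# Example: both sides can recompute these from the same public seed.
seed = b"example-challenge-seed"
indices = derive_indices(seed, seq_len=128)
commitment = sketch([3, -1, 4, 1, 5], seed)
```

Because the indices depend only on the seed and the sequence length, a verifier can recompute which positions to check without trusting the prover's choice.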

2. Rollout Generation (`grail/mining/rollout_generator.py`, `grail/environments/sat.py`)

GRPO-style rollout system with:

Multiple rollouts per problem, token-level logprob tracking, advantage computation
Qwen-style chat template injection for reasoning/solution tagging
SAT-specific SATRolloutGenerator with modular reward vector composition
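As a rough illustration of the advantage computation mentioned above, the sketch below normalizes each rollout's reward against the other rollouts for the same problem, which is the usual GRPO formulation. Whether grail divides by the standard deviation, and the name `group_advantages`, are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Center each reward on the group mean and scale by the group std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every rollout earns the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts for one problem: two solved it, two did not.
advantages = group_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat the group average get a positive advantage and the rest get a negative one, so no separate value model is needed.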

3. Environment System (`grail/environments/`)

Modular environments, including:

SAT Problems (sat.py): Deterministic 3-SAT generation, parsing, reward shaping

4. Communication & Storage (`grail/infrastructure/comms.py`)

Object-storage utilities for miner/validator coordination:

Upload mined rollouts (sink_window_inferences), publish validated rollouts (upload_valid_rollouts)

5. Randomness & Chain

Randomness (grail/infrastructure/drand.py): drand client that tries the v2 API first, with fallbacks and a mock beacon for testing
Chain & credentials (grail/infrastructure/chain.py): Manages R2 credential commitments and metagraph access
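The fallback idea behind the randomness client can be sketched without any network code. The example below is not the interface of `grail/infrastructure/drand.py`: the `Beacon` type, `mock_beacon`, and `fetch_randomness` are hypothetical names, and real sources would be HTTP calls to drand endpoints. The mock beacon is meant for tests only.

```python
import hashlib
from typing import Callable, Optional

# A source maps a round number to random bytes, or returns None / raises on failure.
Beacon = Callable[[int], Optional[bytes]]


def mock_beacon(round_number: int) -> bytes:
    """Deterministic stand-in for tests: a hash of the round number."""
    return hashlib.sha256(b"mock-drand" + round_number.to_bytes(8, "big")).digest()


def fetch_randomness(round_number: int, sources: list[Beacon]) -> bytes:
    """Try each source in order and return the first non-empty result."""
    for source in sources:
        try:
            value = source(round_number)
        except Exception:
            continue  # this source failed, move on to the next one
        if value:
            return value
    raise RuntimeError(f"no beacon returned randomness for round {round_number}")
```

In a test, passing `[mock_beacon]` as the only source gives reproducible randomness without touching the network.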

6. CLI (`grail/cli/`)

Typer-based CLI with subcommands: mine, validate (and experimental train).

Best practices for miners:

Leave the final 2 blocks of each window for upload; the miner is expected to stop generating near the end of the window on its own.
Prefer uv sync for reproducible installs.
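The upload buffer is simple block arithmetic. The sketch below assumes windows are aligned to multiples of `WINDOW_LENGTH` (50 blocks, per the constants in this README); `UPLOAD_BUFFER`, `window_start`, and `should_keep_generating` are hypothetical names for illustration.

```python
WINDOW_LENGTH = 50  # blocks per scoring window
UPLOAD_BUFFER = 2   # hypothetical name for the final blocks reserved for upload


def window_start(block: int) -> int:
    """First block of the window containing `block` (assumes aligned windows)."""
    return (block // WINDOW_LENGTH) * WINDOW_LENGTH


def should_keep_generating(block: int) -> bool:
    """True while more than UPLOAD_BUFFER blocks remain, counting the current one."""
    blocks_left = window_start(block) + WINDOW_LENGTH - block
    return blocks_left > UPLOAD_BUFFER
```

For the window starting at block 1000, generation continues through block 1047 and stops at 1048, leaving blocks 1048 and 1049 for upload.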

How It Works

Post-Training Flow

Problem Generation: Miners and validators derive the same SAT instance from a public seed that mixes drand randomness with the window's block hash
Rollout Collection: Miners generate multiple GRPO rollouts, tracking token ids and logprobs for proof construction
GRAIL Verification: Validators verify tokens, the GRAIL commitment/opening against the claimed model, the deterministic SAT instance, and the reported solution
Reward & Weights: Validators score miners over recent windows using unique/valid/successful rollout metrics with a superlinear curve, then normalize and set weights on-chain
Model Updates (planned): Validated rollouts will be used for post-training in a future release
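The "superlinear curve, then normalize" step can be illustrated as follows. This README does not state the exponent or how the unique, valid, and successful counts combine into a base score, so the sketch takes one base score per miner and uses an exponent of 2.0 purely as an example; `miner_weights` is a hypothetical name.

```python
def miner_weights(scores: dict[str, float], exponent: float = 2.0) -> dict[str, float]:
    """Raise each base score to a power above 1, then normalize to sum to 1."""
    curved = {uid: max(s, 0.0) ** exponent for uid, s in scores.items()}
    total = sum(curved.values())
    if total == 0:
        return {uid: 0.0 for uid in scores}  # nobody scored, nobody gets weight
    return {uid: c / total for uid, c in curved.items()}


weights = miner_weights({"miner_a": 10.0, "miner_b": 20.0})
```

With an exponent of 2, a miner with twice the base score receives four times the weight (here 0.8 against 0.2), which rewards concentrating effort over splitting it across identities.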

Verifiable Inference

The GRAIL protocol ensures:

Deterministic, publicly auditable challenges (drand + chain context)
Model-binding proof of token processing; no substitution or replay
Deterministic SAT instance reconstruction and solution verification
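A challenge is publicly auditable when anyone can recompute it from public inputs. The sketch below shows one way to mix drand randomness with chain context into a seed. The use of SHA-256, the length-prefixed encoding, and the name `window_seed` are assumptions for illustration; the actual mixing in grail may differ.

```python
import hashlib


def window_seed(drand_randomness: bytes, block_hash: bytes, window_start: int) -> bytes:
    """Hash the public inputs into a 32-byte seed that any party can recompute."""
    h = hashlib.sha256()
    for part in (drand_randomness, block_hash, window_start.to_bytes(8, "big")):
        # A length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()
```

Since neither the drand output nor the block hash is known in advance, a miner cannot precompute rollouts for a future window, and since both are public afterwards, any validator can rebuild the same seed.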

Technical Details

Protocol & Config (from `grail/shared/constants.py`)

PRIME_Q: 2,147,483,647 (prime modulus for sketches, equal to 2^31 − 1)
CHALLENGE_K: 16 (minimum challenged positions)
WINDOW_LENGTH: 50 blocks per scoring window

Supported Environments

3-SAT: Variables 3–10, Clauses 5–20, Clause length 3; deterministic from seed
GSM8K: Math word problems from the GSM8K dataset with step-by-step reasoning verification
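For the 3-SAT environment, "deterministic from seed" means the same seed always yields the same instance, so a validator can rebuild the problem and check a reported assignment. The sketch below uses the ranges listed above but is not the generator in `grail/environments/sat.py`: `generate_3sat` and `satisfies` are hypothetical names, and Python's `random.Random` stands in for whatever generator the project uses. A cross-platform protocol would need to pin the generator exactly.

```python
import random

Clause = tuple[int, int, int]  # signed literals: +v means v is true, -v means v is false


def generate_3sat(seed: int) -> tuple[int, list[Clause]]:
    """Build a 3-SAT instance with 3 to 10 variables and 5 to 20 clauses."""
    rng = random.Random(seed)  # private generator: same seed, same instance
    num_vars = rng.randint(3, 10)
    num_clauses = rng.randint(5, 20)
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), 3)  # 3 distinct variables
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return num_vars, clauses


def satisfies(assignment: list[bool], clauses: list[Clause]) -> bool:
    """True if every clause contains at least one literal the assignment makes true."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

Checking a solution takes one pass over the clauses, which is why SAT suits verification: finding an assignment can be hard, but confirming one is cheap.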

Model Requirements

Compatible with Hugging Face Transformers and exposes token ids and logprobs
OS- and hardware-agnostic: runs on any platform whose floating-point results stay within the verification tolerance
Accelerators (GPU/TPU) recommended for throughput

For detailed hardware specifications, see compute.min.yaml.

Setup

For detailed setup instructions, please refer to the appropriate documentation:

Mining Setup

See Miner Documentation for comprehensive setup instructions including:

Hardware and environment requirements
Wallet and network configuration
R2/S3 credentials setup
Dependency installation
Running the miner

Validation Setup

See Validator Documentation for comprehensive setup instructions including:

Hardware and environment requirements
Wallet and network configuration
Dependency installation
Running the validator

Quick Start

# Install dependencies
uv sync

# Run miner
grail mine

# Run validator
grail validate

Important Notes:

Randomness is fetched from drand; miners mix it with the window's block hash
Rollouts are uploaded to object storage (R2/S3); validators fetch, verify, score, and set weights
For monitoring:
- Miners and validators can log detailed metrics to the public W&B project: https://wandb.ai/tplr/grail
- Real-time system logs and network statistics are available at the Grafana dashboard: https://grail-grafana.tplr.ai/

Architecture Benefits

Verifiable Training: Cryptographic binding of rollouts to model and input
Decentralized Post-Training: Internet-scale contribution and evaluation
Problem Agnostic: Environment framework enables new domains beyond SAT
Incentive Aligned: On-chain weights reward sustained, verifiable improvements

Contributing

We welcome contributions to:

New environments and reward vectors
Protocol robustness and verification
Performance and throughput improvements
Documentation and examples
