
Commit eaf493e
docs: clarify compares, limitations, boundaries
1 parent cbafa38

File tree: 1 file changed
README.md: 87 additions & 40 deletions
@@ -2,7 +2,7 @@
 
 Extension of [zkml](https://github.com/uiuc-kang-lab/zkml) for distributed proving using Ray, layer-wise partitioning, and Merkle trees.
 
-> **⚠️ Status Note:** This is an experimental research project. For production zkml, consider [zk-torch](https://github.com/uiuc-kang-lab/zk-torch) which uses proof folding for parallelization. See [Status and Limitations](#status-and-limitations) for details.
+> **⚠️ Status Note:** This is an experimental research project. Also consider [zk-torch](https://github.com/uiuc-kang-lab/zk-torch).
 
 ## Completed Milestones
 
@@ -11,15 +11,16 @@ Extension of [zkml](https://github.com/uiuc-kang-lab/zkml) for distributed provi
 3. ~~**Ray-Rust integration**: Connect Python Ray workers to Rust proof generation ([#9](https://github.com/ray-project/distributed-zkml/issues/9))~~ Done
 4. ~~**GPU acceleration**: ICICLE GPU backend for MSM operations ([#10](https://github.com/ray-project/distributed-zkml/issues/10))~~ Done - see [GPU Acceleration](#gpu-acceleration)
 
-**Note**: For production zkML, see [zk-torch](https://github.com/uiuc-kang-lab/zk-torch) or [Status and Limitations](#status-and-limitations).
-
 ---
 
 ## Table of Contents
 
 - [Status and Limitations](#status-and-limitations)
 - [Overview](#overview)
 - [Implementation](#implementation)
+- [How Distributed Proving Works](#how-distributed-proving-works)
+- [Security Model and Trust Boundaries](#security-model-and-trust-boundaries)
+- [Structure](#structure)
 - [Requirements](#requirements)
 - [Quick Start](#quick-start)
 - [GPU Acceleration](#gpu-acceleration)
@@ -38,7 +39,10 @@ This project implements a **Ray-based distributed proving approach** for zkml. I
 
 **Proof Composition**: This implementation generates separate proofs per chunk. It does not implement recursive proof composition or aggregation. Verifiers must check O(n) proofs rather than O(1), limiting succinctness.
 
-**Security Assumptions**: The distributed trust model (Ray workers) is not formally analyzed. It does not address malicious worker resistance, collusion resistance, and Byzantine fault tolerance.
+**Trust Domain**:
+- **Merkle trees provide privacy for proof readers, not compute providers**: The prover must know all weights and activations to generate a valid ZK proof. Merkle trees hide intermediate values from people *reading the published proof*, not from the compute provider *during execution*.
+- **Multi-party security requires different trust domains**: Security only applies when chunks are distributed across different trust domains (e.g., your servers + AWS), not just different AWS regions.
+- **Comparison to TEE/FHE/MPC**: Trusted Execution Environments (TEEs), Fully Homomorphic Encryption (FHE), or Multi-Party Computation (MPC) provide stronger privacy guarantees, but at costs that currently make them impractical for scalable AI workloads.
 
 ### When to Use This
 
@@ -47,6 +51,11 @@ This project implements a **Ray-based distributed proving approach** for zkml. I
 - Need examples of Ray integration for cryptographic workloads
 - Studying Merkle-based privacy for intermediate computations
 - Building distributed halo2 proving (not zkML-specific)
+- **Use case**: You trust compute providers but want to limit public proof exposure, or the model is partitioned across multiple non-colluding organizations
+
+**Use alternatives if:**
+- You need to hide data from the compute providers themselves → requires TEEs/FHE/MPC
+- You need a single aggregated proof → consider [zk-torch](https://github.com/uiuc-kang-lab/zk-torch)
 
 ---
 
@@ -55,17 +64,17 @@ This project implements a **Ray-based distributed proving approach** for zkml. I
 This repository extends zkml (see [ZKML paper](https://ddkang.github.io/papers/2024/zkml-eurosys.pdf)) with distributed proving capabilities. zkml provides an optimizing compiler from TensorFlow to halo2 ZK-SNARK circuits.
 
 distributed-zkml adds:
-- **Layer-wise partitioning**: Split ML models into chunks for parallel proving
-- **Merkle trees**: Privacy-preserving commitments to intermediate values using Poseidon hashing
-- **Ray integration**: Distributed execution across GPU workers
+- **Layer-wise partitioning**: Split ML models into chunks for parallel proving across GPUs via Ray
+- **Merkle tree commitments**: Hash intermediate activations with Poseidon; only the root is published in the proof
+- **ICICLE GPU acceleration**: Hardware-accelerated MSM operations
 
 ### Comparison to zkml
 
 | Feature | zkml | distributed-zkml |
 |---------|------|------------------|
 | Architecture | Single-machine | Distributed across GPUs |
 | Scalability | Single GPU memory | Horizontal scaling |
-| Privacy | Outputs public | Intermediate values private via Merkle trees |
+| Privacy | Outputs public | Intermediate values hidden from proof readers via Merkle trees |
 
 ## Implementation
 
@@ -76,32 +85,70 @@ distributed-zkml adds:
 3. **Merkle Commitments**: Hash intermediate outputs with Poseidon, only root is public
 4. **On-Chain**: Publish only the Merkle root (O(1) public values vs O(n) without)
 
-**Note**: Each chunk produces a separate proof. This implementation does not aggregate proofs into a single succinct proof. Verifiers must check all chunk proofs individually (O(n) verification time). For single-proof aggregation, see [zk-orch](hhttps://github.com/uiuc-kang-lab/zk-torch)'s accumulation-based approach.
+**Note**: Each chunk produces a separate proof. This implementation does not aggregate proofs into a single succinct proof. Verifiers must check all chunk proofs individually (O(n) verification time). For single-proof aggregation, see [zk-torch](https://github.com/uiuc-kang-lab/zk-torch)'s accumulation-based approach.
 
-\`\`\`
+```
 Model: 9 layers -> 3 chunks
 Chunk 1: Layers 0-2 -> GPU 1 -> Hash A
 Chunk 2: Layers 3-5 -> GPU 2 -> Hash B
 Chunk 3: Layers 6-8 -> GPU 3 -> Hash C
 
 Merkle Tree:
           Root (public)
-          /      \\
+          /      \
    Hash(AB)    Hash C
-    /    \\
+    /    \
 Hash A    Hash B
-\`\`\`
+```
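The chunking-and-commitment scheme diagrammed above can be sketched in a few lines of Python. This is a hedged illustration, not the project's code: the helper names are hypothetical, and SHA-256 stands in for the Poseidon hash that zkml actually uses over the halo2 field.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 stands in for Poseidon (zkml hashes field elements, not bytes).
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise; an odd node is promoted, like Hash C above."""
    level = leaves
    while len(level) > 1:
        nxt = [h(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd node carried up unchanged
        level = nxt
    return level[0]

# 9 layers split into 3 chunks of 3 layers each, one per GPU worker.
layers = list(range(9))
chunks = [layers[i:i + 3] for i in range(0, len(layers), 3)]
assert chunks == [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# Each worker hashes its chunk's output activations; only the root is public.
leaf_hashes = [h(bytes(chunk)) for chunk in chunks]
root = merkle_root(leaf_hashes)  # Root = H(H(A || B) || C), as in the diagram
assert root == h(h(leaf_hashes[0] + leaf_hashes[1]) + leaf_hashes[2])
```

With real chunk outputs the leaves would be commitments to activation tensors; the tree shape matches the three-leaf diagram above.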
+
+### Trust Boundaries
+
+#### What Merkle Trees Provide
+
+| Scenario | Hidden? | Explanation |
+|----------|---------|-------------|
+| Proof readers reconstructing weights via model inversion | Yes | Intermediate activations are hashed, not exposed in the proof |
+| Compute provider seeing weights during execution | No | The provider must have the weights to generate a ZK proof |
+| Compute provider seeing intermediate activations during execution | No | The provider computes them |
+
+**Key insight:** Merkle trees hide intermediate values from people *reading the published proof*, not from the compute provider *during execution*. The prover must know all values to generate a valid ZK proof.
+
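One concrete consequence of the first table row: a proof reader can check that a particular chunk commitment is included under the published root without ever seeing the other chunks' values, only their hashes. A minimal sketch (hypothetical helper names; SHA-256 again stands in for Poseidon):

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 stands in for Poseidon here.
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, path: list[tuple[bytes, str]], root: bytes) -> bool:
    """Recompute the root from one leaf plus its sibling hashes.

    `path` lists (sibling_hash, side) pairs from leaf to root. The verifier
    never sees the other leaves' preimages, only their hashes.
    """
    acc = leaf
    for sibling, side in path:
        acc = h(sibling + acc) if side == "left" else h(acc + sibling)
    return acc == root

# Tree from the diagram above: Root = H(H(A || B) || C)
A, B, C = h(b"chunk-1"), h(b"chunk-2"), h(b"chunk-3")
root = h(h(A + B) + C)

# Proving A is committed reveals only the sibling hashes B and H-level C.
assert verify_inclusion(A, [(B, "right"), (C, "right")], root)
assert not verify_inclusion(h(b"tampered"), [(B, "right"), (C, "right")], root)
```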
+#### Multi-Party Proving and Trust Domains
+
+Security depends on **trust domains**, not physical location:
+
+| Setup | Trust Domains | What's Private |
+|-------|---------------|----------------|
+| Single AWS account (any region) | 1 | Nothing from AWS — they control all regions |
+| Your servers + AWS | 2 | Your portion's weights are never sent to AWS |
+| AWS + Google + Azure | 3 | Each provider sees only its own chunk (assuming non-collusion) |
+
+**Multi-party benefit:** If the model is partitioned across different trust domains (e.g., your servers + AWS), no single party holds the full model. Combined with Merkle trees, this provides layered privacy:
+- **Partitioning** → limits what any single provider can access
+- **Merkle trees** → limits what proof readers can observe
+
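The table's argument can be restated as a tiny invariant check. A toy sketch only: the chunk-to-provider assignment below is hypothetical, and nothing here is part of the project's API.

```python
# Chunk -> trust domain, as in the "Your servers + AWS" row above.
assignment = {0: "your-servers", 1: "aws", 2: "aws"}

all_chunks = set(assignment)
for domain in set(assignment.values()):
    visible = {c for c, d in assignment.items() if d == domain}
    # Privacy invariant: no single trust domain holds every chunk.
    assert visible != all_chunks

# A single-account setup (any number of regions) violates the invariant:
single = {0: "aws", 1: "aws", 2: "aws"}
assert {c for c, d in single.items() if d == "aws"} == set(single)
```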
+#### Comparison with ZKTorch
+
+| Aspect | distributed-zkml | ZKTorch |
+|--------|------------------|---------|
+| Scaling strategy | Horizontal (more machines via Ray) | Vertical (proof compression via Mira) |
+| Final output | N separate proofs | 1 accumulated proof |
+| Verification cost | O(N) proofs to verify | O(1) single proof |
+| Intermediate privacy | Merkle trees hide intermediates from proof readers | Exposed in proof |
+| Base system | halo2 (~30M param limit) | Custom pairing-based (6B params tested) |
+
+**These approaches are orthogonal** — Ray parallelism could in principle be combined with Mira accumulation.
 
 ### Structure
 
-\`\`\`
+```
 distributed-zkml/
 ├── python/   # Python wrappers for Rust prover
 ├── tests/    # Distributed proving tests
 └── zkml/     # zkml (modified for Merkle + chunking)
     ├── src/bin/prove_chunk.rs
     └── testing/
-\`\`\`
+```
 
 ## Requirements
 
@@ -115,12 +162,12 @@ Just Docker and Docker Compose. Everything else is in the container.
 |------------|-------|
 | Rust (nightly) | Install via [rustup](https://rustup.rs/) |
 | Python >=3.10 | |
-| pip | \`pip install -e .\` |
-| Build tools | Linux: \`build-essential pkg-config libssl-dev\`; macOS: Xcode CLI |
+| pip | `pip install -e .` |
+| Build tools | Linux: `build-essential pkg-config libssl-dev`; macOS: Xcode CLI |
 
-**Python deps** (installed via \`pip install -e .\`):
-- \`ray[default]>=2.31.0\`
-- \`msgpack\`, \`numpy\`
+**Python deps** (installed via `pip install -e .`):
+- `ray[default]>=2.31.0`
+- `msgpack`, `numpy`
 
 **Optional**: NVIDIA GPU + CUDA 12.x + ICICLE backend for GPU acceleration
 
@@ -130,16 +177,16 @@ Just Docker and Docker Compose. Everything else is in the container.
 
 ### Docker
 
-\`\`\`bash
+```bash
 docker compose build dev
 docker compose run --rm dev
 # Inside container:
 cd zkml && cargo test --test merkle_tree_test -- --nocapture
-\`\`\`
+```
 
 ### Native
 
-\`\`\`bash
+```bash
 # Install Rust
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 
@@ -148,7 +195,7 @@ cd zkml && rustup override set nightly && cargo build --release && cd ..
 
 # Python deps
 pip install -e .
-\`\`\`
+```
 
 ---
 
@@ -164,9 +211,9 @@ Uses [ICICLE](https://github.com/ingonyama-zk/icicle) for GPU-accelerated MSM (M
 
 ### Setup
 
-\`\`\`bash
+```bash
 # 1. Download ICICLE backend (Ubuntu 22.04 - use ubuntu20 for 20.04)
-curl -L -o /tmp/icicle.tar.gz \\
+curl -L -o /tmp/icicle.tar.gz \
   https://github.com/ingonyama-zk/icicle/releases/download/v3.1.0/icicle_3_1_0-ubuntu22-cuda122.tar.gz
 
 # 2. Install
@@ -180,13 +227,13 @@ cd zkml && cargo build --release --features gpu
 
 # 5. Verify
 cargo test --test gpu_benchmark_test --release --features gpu -- --nocapture
-\`\`\`
+```
 
 Expected output:
-\`\`\`
+```
 Registered devices: ["CUDA", "CPU"]
 Successfully set CUDA device 0
-\`\`\`
+```
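For context on what ICICLE accelerates: a multi-scalar multiplication (MSM) combines n scalars and n group elements into one sum. A toy sketch with modular integers standing in for elliptic-curve points (the modulus and values are hypothetical; real MSMs operate on ~255-bit curve points):

```python
# Toy modulus standing in for a curve group (halo2 uses a ~255-bit field).
P = 2**61 - 1

def msm(scalars: list[int], points: list[int]) -> int:
    # sum_i s_i * P_i : O(n) group operations, the batch the GPU parallelizes.
    acc = 0
    for s, pt in zip(scalars, points):
        acc = (acc + s * pt) % P
    return acc

assert msm([1, 2, 3], [10, 20, 30]) == 140  # 1*10 + 2*20 + 3*30
```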
 
 ### Benchmarks (T4)
 
@@ -198,41 +245,41 @@ Successfully set CUDA device 0
 
 ### FFT/NTT Notes
 
-- **Measure FFT time**: \`HALO2_FFT_STATS=1\`
-- **GPU NTT (experimental)**: \`HALO2_USE_GPU_NTT=1\` - currently slower due to conversion overhead
+- **Measure FFT time**: `HALO2_FFT_STATS=1`
+- **GPU NTT (experimental)**: `HALO2_USE_GPU_NTT=1` - currently slower due to conversion overhead
 
 ---
 
 ## Testing
 
 ### Distributed Proving
 
-\`\`\`bash
+```bash
 # Simulation (fast)
-python tests/simple_distributed.py \\
-  --model zkml/examples/mnist/model.msgpack \\
-  --input zkml/examples/mnist/inp.msgpack \\
+python tests/simple_distributed.py \
+  --model zkml/examples/mnist/model.msgpack \
+  --input zkml/examples/mnist/inp.msgpack \
   --layers 4 --workers 2
 
 # Real proofs
 python tests/simple_distributed.py ... --real
-\`\`\`
+```
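The flow that the simulation above exercises can be sketched in plain Python. A hedged stand-in only: `concurrent.futures` replaces `ray.remote`/`ray.get` so the snippet is self-contained, and `prove_chunk` here is a hypothetical placeholder, not zkml's real Rust prover.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib

def prove_chunk(chunk_id: int, layers: list[int]) -> dict:
    # Placeholder for a worker invoking the Rust prover on its chunk;
    # with Ray this would be a @ray.remote task.
    commitment = hashlib.sha256(bytes(layers)).hexdigest()
    return {"chunk": chunk_id, "commitment": commitment}

layers = list(range(4))        # --layers 4
num_workers = 2                # --workers 2
size = len(layers) // num_workers
chunks = [layers[i:i + size] for i in range(0, len(layers), size)]
assert chunks == [[0, 1], [2, 3]]

# With Ray: ray.get([prove_chunk.remote(i, c) for i, c in enumerate(chunks)])
with ThreadPoolExecutor(max_workers=num_workers) as pool:
    proofs = list(pool.map(prove_chunk, range(len(chunks)), chunks))

# One proof per chunk: the verifier must check all of them (O(n)).
assert [p["chunk"] for p in proofs] == [0, 1]
```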

 ### Rust Tests
 
-\`\`\`bash
+```bash
 cd zkml
 cargo test --test merkle_tree_test --test chunk_execution_test -- --nocapture
-\`\`\`
+```
 
 ### CI
 
-Runs on PRs to \`main\`/\`dev\`: builds zkml, runs tests (~3-4 min). GPU tests excluded to save costs.
+Runs on PRs to `main`/`dev`: builds zkml, runs tests (~3-4 min). GPU tests excluded to save costs.
 
 ---

 ## References
 
 - [ZKML Paper](https://ddkang.github.io/papers/2024/zkml-eurosys.pdf) (EuroSys '24) - Original zkml framework
 - [zkml Repository](https://github.com/uiuc-kang-lab/zkml) - Base framework this project extends
-- [zk-torch](https://github.com/uiuc-kang-lab/zk-torch) - Alternative approach using proof accumulation/folding (from same research group)
+- [zk-torch](https://github.com/uiuc-kang-lab/zk-torch) - Alternative approach using proof accumulation/folding.
