@@ -21,8 +21,6 @@ Extension of [zkml](https://github.com/uiuc-kang-lab/zkml) for distributed provi
2121 - [ Structure] ( #structure )
2222- [ Quick Start] ( #quick-start )
2323- [ Testing and CI] ( #testing-and-ci )
24- - [ General] ( #general )
25- - [ Testing on AWS GPU Instances] ( #testing-on-aws-gpu-instances )
2624 - [ CI] ( #ci )
2725- [ References] ( #references )
2826
@@ -157,198 +155,36 @@ python3 tests/simple_distributed.py \
157155
158156## Testing and CI
159157
160- ### General
158+ ### Distributed Proving Test
161159
162- #### Python Tests (pytest)
160+ Test the distributed proving pipeline:
163161
164- ##### Run All Tests
165162``` bash
166- pytest tests/
167- ```
168-
169- ##### Run specific GPU and AWS tests
170- ``` bash
171- pytest tests/aws/gpu_test.py
172- pytest tests/aws/gpu_test.py::test_aws_credentials
173- ```
174-
175- #### Rust Tests (Cargo)
176-
177- ##### Run All Tests in zkml
178- ``` bash
179- cd zkml
180- # Run only the test files (recommended)
181- cargo test --test merkle_tree_test --test chunk_execution_test
182-
183- # Note: Running `cargo test` without --test flags will try to compile examples,
184- # some of which may have errors. Use --test flags to run specific tests.
185- ```
186-
187- ##### Run Specific Test File
188- ``` bash
189- cd zkml
190- cargo test --test merkle_tree_test
191- cargo test --test chunk_execution_test
192- ```
163+ # Simulation mode (fast, no real proofs)
164+ python tests/simple_distributed.py \
165+ --model zkml/examples/mnist/model.msgpack \
166+ --input zkml/examples/mnist/inp.msgpack \
167+ --layers 4 --workers 2
193168
194- ##### Run Tests with Output
195- ``` bash
196- cd zkml
197- cargo test --test merkle_tree_test --test chunk_execution_test -- --nocapture
169+ # Real mode (generates actual ZK proofs)
170+ python tests/simple_distributed.py \
171+ --model zkml/examples/mnist/model.msgpack \
172+ --input zkml/examples/mnist/inp.msgpack \
173+ --layers 4 --workers 2 --real
198174```
199175
200- ##### Run Tests for distributed-zkml Crate
201- ``` bash
202- # From distributed-zkml root
203- cargo test
204- ```
176+ ### Rust Tests
205177
206- ##### Check Compilation Only
207178``` bash
208179cd zkml
209- cargo check --lib
210- ```
211-
212- Broken example files are moved to ` zkml/examples/broken/ ` to prevent compilation errors. Use ` --test ` flags when running tests.
213-
214- #### Test Files
215-
216- ##### Python Tests
217- - ` tests/aws/gpu_test.py ` - AWS GPU tests
218- - ` test_aws_credentials() ` - Check AWS credentials
219- - ` test_gpu_availability() ` - Check GPU availability
220- - ` test_ray_setup() ` - Test Ray cluster setup
221-
222- ##### Rust Tests
223- - ` zkml/testing/merkle_tree_test.rs ` - Merkle tree tests
224- - ` test_merkle_single_value() ` - Single value Merkle tree
225- - ` test_merkle_multiple_values() ` - Multiple values Merkle tree
226- - ` test_merkle_root_verification() ` - Root verification
227-
228- - ` zkml/testing/chunk_execution_test.rs ` - Chunk execution tests
229- - ` test_chunk_execution_intermediate_values() ` - Extract intermediate values
230- - ` test_chunk_execution_with_merkle() ` - Chunk execution with Merkle
231- - ` test_multiple_chunks_consistency() ` - Multiple chunks consistency
232-
233- ### Testing on AWS GPU Instances
234-
235- #### Prerequisites
236-
237- ##### AWS Credentials
238-
239- Set the following environment variables:
240180
241- ``` bash
242- export AWS_ACCESS_KEY_ID=your_access_key
243- export AWS_SECRET_ACCESS_KEY=your_secret_key
244- export AWS_SESSION_TOKEN=your_session_token
245- ```
246-
247- ##### AWS Resource Configuration
248-
249- ** Option 1: Automated Setup (Recommended)**
250-
251- Run the setup script to automatically get/create resources:
252-
253- ``` bash
254- # Optional: Set custom resource names for auto-detection
255- export AWS_KEY_NAME=your-key-name # Optional: for auto-detection
256- export AWS_SECURITY_GROUP_NAME=your-sg-name # Optional: for auto-detection
257-
258- # Run setup script (will prompt or create resources)
259- ./tests/aws/setup_aws_resources.sh
260-
261- # Copy the export commands it shows, then set:
262- export KEY_NAME=your-key-name # Required: from setup script
263- export SECURITY_GROUP=sg-xxxxx # Required: from setup script
264- ```
265-
266- ** Option 2: Manual Configuration**
267-
268- Set all required variables manually:
269-
270- ``` bash
271- # Required: AWS credentials
272- export AWS_ACCESS_KEY_ID=your_access_key
273- export AWS_SECRET_ACCESS_KEY=your_secret_key
274- export AWS_SESSION_TOKEN=your_session_token
275-
276- # Required: Resource identifiers
277- export KEY_NAME=your-key-name # Your EC2 key pair name
278- export SECURITY_GROUP=sg-xxxxx # Your security group ID
279-
280- # Optional: Override defaults
281- export AWS_REGION=us-west-2 # Default: us-west-2
282- export INSTANCE_TYPE=g5.xlarge # Default: g5.xlarge
283- export AMI_ID=ami-0076e7fffffc9251d # Default: Ubuntu 20.04, PyTorch 2.3.1
284- ```
285-
286- ##### GPU Instance
181+ # Run all tests
182+ cargo test --test merkle_tree_test --test chunk_execution_test --test test_merkle_root_public -- --nocapture
287183
288- Launch an AWS instance with GPU support:
289- - A100: ` g5.xlarge ` or larger (1x A100)
290- - H100: ` p5.48xlarge ` (8x H100)
291-
292- ##### Dependencies
293-
294- ``` bash
295- # Install Rust (nightly)
296- curl --proto ' =https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
297- rustup override set nightly
298-
299- # Install Python dependencies
300- pip install ray torch
301-
302- # Install CUDA drivers (usually pre-installed on GPU instances)
303- nvidia-smi # Verify GPU is available
304- ```
305-
306- #### Test Suite
307-
308- The test suite includes:
309-
310- 1 . AWS Credentials Check: Validates required environment variables
311- 2 . GPU Availability Check: Verifies GPU is accessible via ` nvidia-smi `
312- 3 . Ray Cluster Setup: Initializes Ray with GPU support
313- 4 . Basic GPU Distribution: Tests task distribution across GPU workers
314- 5 . Distributed Proving Simulation: Runs distributed proving with Merkle trees
315-
316- #### Expected Output
317-
318- ```
319- ============================================================
320- AWS GPU Tests for Distributed Proving
321- ============================================================
322- INFO: AWS credentials found
323- INFO: GPU detected
324- INFO: Ray initialized with 1 GPU(s)
325-
326- --- Running: Basic GPU Distribution ---
327- INFO: Testing GPU distribution with 2 workers
328- INFO: Completed 4 tasks
329- INFO: Task 0: Worker 0, GPU 0, Time: 2.34ms
330- ...
331-
332- --- Running: Distributed Proving Simulation ---
333- INFO: Testing distributed proving simulation
334- INFO: Distributed proving completed: 2 chunks
335- INFO: Chunk 0: success
336- INFO: Chunk 1: success
337-
338- ============================================================
339- Test Summary
340- ============================================================
341- Basic GPU Distribution: PASS
342- Distributed Proving Simulation: PASS
184+ # Run specific test
185+ cargo test --test merkle_tree_test -- --nocapture
343186```
344187
345- #### Performance Notes
346-
347- - A100: ~ 40GB VRAM, suitable for large models
348- - H100: ~ 80GB VRAM, suitable for very large models
349- - Ray automatically distributes tasks across available GPUs
350- - Monitor GPU usage: ` watch -n 1 nvidia-smi `
351-
352188### CI
353189
354190Lightweight CI runs on every PR to ` main ` and ` dev ` :
0 commit comments