| title | Verifiable Aggregation Workflow | |||||
|---|---|---|---|---|---|---|
| url | https://github.com/rwilliamspbg-ops/Sovereign-Mohawk-Proto | |||||
| labels |
|
|||||
| dataset |
|
Note: If you use this baseline in your work, please cite Flower and any upstream work that inspired your implementation.
Paper/Reference: Sovereign-Mohawk-Proto
Authors: Community contribution by rwilliamspbg-ops
Abstract: This baseline demonstrates a reproducible federated learning workflow in Flower where standard FedAvg aggregation is augmented with optional server-side verification hooks. At each round, the server recomputes the weighted aggregate from raw client updates, compares it to the strategy output under a configurable tolerance, and records deterministic hashes and verification outcomes in a JSON report.
What is implemented: A minimal Message API Flower baseline with deterministic synthetic data, optional verification checks around aggregation outputs, and benchmark-friendly reporting scripts.
Datasets: Fully deterministic synthetic binary classification data generated per client partition.
Hardware Setup: CPU-only runs are supported. Default configuration (8 clients, 5 rounds) typically finishes in under a minute on a laptop-class CPU.
Contributors: rwilliamspbg-ops, Flower community maintainers
Task: Binary classification.
Model: Small MLP with two linear layers and one ReLU.
Dataset:
| Property | Value |
|---|---|
| Source | Generated on the fly (no downloads) |
| Features | 10 float features |
| Labels | Binary (0/1) |
| Clients | 8 by default |
| Local train examples/client | 128 |
| Local val examples/client | 64 |
| Partitioning | Deterministic client-specific distribution shift |
Training Hyperparameters (default):
| Hyperparameter | Value |
|---|---|
| num-server-rounds | 5 |
| num-clients | 8 |
| fraction-train | 1.0 |
| fraction-evaluate | 1.0 |
| local-epochs | 1 |
| learning-rate | 0.05 |
| batch-size | 32 |
| random-seed | 2026 |
| verify-aggregation | true |
| verification-tolerance | 1e-6 |
# Create the virtual environment
pyenv virtualenv 3.12.12 verifiableagg
# Activate it
pyenv activate verifiableagg
# Install baseline
pip install -e .
# If you are contributing changes and want to run lint/type checks
pip install -e ".[dev]"For contributor checks used in Flower baselines CI:
cd ..
./dev/test-baseline-structure.sh verifiableagg
./dev/test-baseline.sh verifiableagg# Run with defaults from pyproject.toml
flwr run .
# Override selected values from the CLI
flwr run . --run-config "num-server-rounds=10 verify-aggregation=true random-seed=2026"
# Run benchmark helper script (train + report check)
bash run_benchmark.shAfter each run, artifacts are written to the directory set by artifacts-dir (default: artifacts):
- artifacts/final_model.pt
- artifacts/report.json
The report includes:
- Effective run configuration
- Per-round aggregated train/eval metrics
- Per-round verification status (pass/fail)
- Maximum absolute replay difference
- Deterministic SHA256 hash of aggregated parameters per round
To summarize and validate verification outcomes:
python benchmark_report.py --report-path artifacts/report.jsonThis command exits non-zero if any round fails verification, which makes it suitable for CI or benchmark automation.
With default settings, all rounds should pass verification with very small numerical replay error (typically near machine precision). Example benchmark output:
round num_replies max_abs_diff passed
1 8 0.00000000e+00 1
2 8 0.00000000e+00 1
3 8 0.00000000e+00 1
4 8 0.00000000e+00 1
5 8 0.00000000e+00 1
All rounds verified. Max observed absolute difference: 0.00000000e+00