feat(baselines): add verifiable aggregation workflow baseline (Message API) #6892

Open
rwilliamspbg-ops wants to merge 7 commits into flwrlabs:main from rwilliamspbg-ops:main

@rwilliamspbg-ops

Description
This PR adds a new Flower baseline that demonstrates a reproducible verifiable aggregation workflow using the Message API.
The goal is to provide an isolated community contribution under baselines that does not change Flower core behavior by default, while still showing how to add optional verification hooks around server-side aggregation outputs.

Related issues/PRs
Fixes #6880.
Related: #6881.

Proposal
Explanation
The PR introduces a self-contained baseline named verifiableagg with the standard baseline structure and contributor tooling support.

Main changes:

  • Added a new baseline package with Message API apps:
      ◦ Client-side local train and evaluate app
      ◦ Server-side app orchestration
      ◦ FedAvg-derived strategy with optional aggregation verification hooks
  • Added deterministic and reproducible baseline setup:
      ◦ Synthetic deterministic per-client data generation
      ◦ Seeded defaults in configuration
      ◦ Configurable verification tolerance and verification toggle
  • Added benchmark and reporting utilities:
      ◦ Script to run and summarize benchmark output
      ◦ JSON report containing run config, round metrics, verification pass or fail, max absolute difference, and aggregate hash
  • Added full baseline documentation:
      ◦ Environment setup
      ◦ Run instructions
      ◦ Expected results
      ◦ Contributor check commands
  • Added baseline-local ignore rules:
      ◦ Ignore runtime artifacts so generated files are not committed
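
To make the "synthetic deterministic per-client data generation" item concrete, here is a minimal sketch of one way such a generator could work. It is not the code from this PR: the function name `make_partition`, the seed-derivation rule, and the labeling rule are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import random


def make_partition(
    partition_id: int,
    num_samples: int = 8,
    num_features: int = 4,
    base_seed: int = 42,
) -> tuple[list[list[float]], list[int]]:
    """Generate a reproducible synthetic binary-classification partition.

    Hypothetical helper: the name, defaults, and labeling rule are
    illustrative assumptions, not the baseline's actual implementation.
    """
    # Derive a distinct but reproducible random stream for each client.
    rng = random.Random(base_seed * 1_000_003 + partition_id)
    features = [
        [rng.gauss(0.0, 1.0) for _ in range(num_features)]
        for _ in range(num_samples)
    ]
    # Label is 1 when the feature sum is positive: a simple, learnable rule.
    labels = [1 if sum(row) > 0 else 0 for row in features]
    return features, labels


# The same partition ID and seed always produce the same data.
x_a, y_a = make_partition(partition_id=0)
x_b, y_b = make_partition(partition_id=0)
# A different partition ID produces a different stream.
x_c, _ = make_partition(partition_id=1)
```

Deriving each client's seed from a shared base seed plus the partition ID keeps runs repeatable while still giving every client its own data, which is what makes a later replay of the aggregation comparable across runs.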
Validation performed:

  • Baseline structure check passed: ./dev/test-baseline-structure.sh verifiableagg
  • Baseline quality checks passed: ./dev/test-baseline.sh verifiableagg
      ◦ Includes isort, black, docformatter, ruff, mypy, pylint, and pytest

Checklist
Implement proposed change
Write tests
Update documentation
Address LLM-reviewer comments, if applicable (e.g., GitHub Copilot)
Make CI checks pass
Ping maintainers on Slack (channel #contributions)
Any other comments?
This is intentionally a focused first-phase contribution with an isolated baseline and reproducible workflow.
If preferred, follow-up PRs can extend this baseline with larger-scale benchmark variants or additional verification mechanisms.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new self-contained verifiableagg baseline under baselines/ demonstrating a reproducible (synthetic, seeded) verifiable aggregation workflow using Flower’s Message API, without changing Flower core behavior.

Changes:

  • Introduces a Message API ServerApp/ClientApp baseline that runs FedAvg with optional post-aggregation verification (replay + tolerance + hashing).
  • Adds deterministic synthetic per-client dataset generation and JSON reporting utilities (plus a benchmark helper script).
  • Adds baseline packaging/configuration (pyproject.toml), docs, and baseline-local ignore rules.
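
The "replay + tolerance + hashing" check mentioned above can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the `VerifiableFedAvg` code from the PR: the function names, the flat list-of-floats weight representation, the default tolerance, and the choice of SHA-256 over packed float64 values are all hypothetical.

```python
import hashlib
import struct


def weighted_average(updates: list[tuple[list[float], int]]) -> list[float]:
    """FedAvg-style aggregation: average weights by number of examples."""
    total = sum(num_examples for _, num_examples in updates)
    dim = len(updates[0][0])
    return [
        sum(weights[i] * num_examples for weights, num_examples in updates) / total
        for i in range(dim)
    ]


def verify_aggregate(
    aggregate: list[float],
    updates: list[tuple[list[float], int]],
    tolerance: float = 1e-6,
) -> dict:
    """Replay the aggregation and compare it with the reported aggregate."""
    replayed = weighted_average(updates)
    max_abs_diff = max(abs(a - r) for a, r in zip(aggregate, replayed))
    # Hash a fixed-width little-endian float64 encoding so the digest
    # does not depend on platform byte order or string formatting.
    packed = struct.pack(f"<{len(aggregate)}d", *aggregate)
    return {
        "verified": max_abs_diff <= tolerance,
        "max_abs_diff": max_abs_diff,
        "aggregate_hash": hashlib.sha256(packed).hexdigest(),
    }


# Two illustrative client updates: (weights, num_examples).
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
aggregate = weighted_average(updates)
report = verify_aggregate(aggregate, updates)
# A tampered aggregate should fail the tolerance check.
tampered = verify_aggregate([2.5, 3.6], updates)
```

The tolerance exists because a replay on different hardware or with a different summation order may not reproduce floating-point results bit for bit, while the hash gives a compact fingerprint of the aggregate that a report can record per round.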

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| baselines/verifiableagg/verifiableagg/utils.py | Adds as_bool helper for run_config parsing. |
| baselines/verifiableagg/verifiableagg/strategy.py | Implements VerifiableFedAvg with replay-based aggregation verification and hashing. |
| baselines/verifiableagg/verifiableagg/server_app.py | Orchestrates training, writes model + JSON report artifacts. |
| baselines/verifiableagg/verifiableagg/reporting.py | Adds JSON report writer utility. |
| baselines/verifiableagg/verifiableagg/model.py | Defines small MLP + train/eval loops for synthetic task. |
| baselines/verifiableagg/verifiableagg/dataset.py | Implements deterministic synthetic per-partition dataset + loaders. |
| baselines/verifiableagg/verifiableagg/client_app.py | Implements Message API client train/evaluate handlers. |
| baselines/verifiableagg/verifiableagg/__init__.py | Package marker/docstring. |
| baselines/verifiableagg/run_benchmark.sh | Convenience script to run baseline + summarize report. |
| baselines/verifiableagg/pyproject.toml | Baseline dependencies, tooling config, Flower app/federation config defaults. |
| baselines/verifiableagg/benchmark_report.py | CLI tool to tabulate verification rounds and exit non-zero on failures. |
| baselines/verifiableagg/README.md | Baseline documentation (setup, run, expected outputs). |
| baselines/verifiableagg/.gitignore | Ignores baseline runtime artifacts (e.g., artifacts/). |
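
As a rough illustration of what a report tool that "tabulates verification rounds and exits non-zero on failures" might do, the sketch below reads a JSON report and returns an exit code. The report schema (`rounds`, `round`, `verified`, `max_abs_diff`), the function name `summarize_report`, and the sample values are assumptions made for this example; the actual `benchmark_report.py` may differ.

```python
import json


def summarize_report(report_json: str) -> int:
    """Print one row per round and return a process exit code.

    The schema used here is a hypothetical stand-in for the baseline's
    real report format.
    """
    report = json.loads(report_json)
    failures = 0
    print(f"{'round':>5}  {'verified':>8}  {'max_abs_diff':>12}")
    for entry in report["rounds"]:
        ok = bool(entry["verified"])
        failures += 0 if ok else 1
        print(f"{entry['round']:>5}  {str(ok):>8}  {entry['max_abs_diff']:>12.2e}")
    # Non-zero exit code signals that at least one round failed verification.
    return 1 if failures else 0


# Illustrative report with one passing and one failing round.
sample = json.dumps(
    {
        "rounds": [
            {"round": 1, "verified": True, "max_abs_diff": 0.0},
            {"round": 2, "verified": False, "max_abs_diff": 0.25},
        ]
    }
)
exit_code = summarize_report(sample)
```

In a command-line entry point the returned value would typically be passed to `sys.exit`, so that a wrapper script or CI job can fail when any round does not verify.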

@rwilliamspbg-ops
Author

@copilot apply changes based on the comments in this thread

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@github-actions bot added the Contributor label on Mar 30, 2026
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.


@rwilliamspbg-ops
Author

@copilot apply changes based on the comments in this thread

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@rwilliamspbg-ops
Author

@copilot apply changes based on the comments in this thread

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


@rwilliamspbg-ops requested a review from Copilot on April 4, 2026, 13:11
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.


@rwilliamspbg-ops
Author

@copilot apply changes based on the comments in this thread

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Labels

Contributor: Used to determine which PRs (mainly) come from external contributors.

Development

Successfully merging this pull request may close these issues.

New Baseline Proposal: Verifiable Aggregation Workflow (community contribution)

2 participants