- Python 3.8.17
- NvFlare 2.4.0
Decentralized ComBat is a privacy‑preserving tool that harmonizes neuroimaging data stored at multiple labs without ever copying raw files to a central server. Each site runs the ComBat math locally, shares only encrypted summary statistics with a lightweight aggregator, and then adjusts its data using the combined grand mean and variance. The result is a dataset that is statistically “site‑neutral,” giving analyses the same power and consistency as traditional, centralized ComBat while sidestepping legal, storage, and security hurdles. Tested on traumatic‑brain‑injury studies and large‑scale simulations, the method matches centralized results, scales cleanly to many sites, and lets researchers blend public and private datasets that previously could not be combined. In short, Decentralized ComBat makes multi‑center neuroimaging studies easier, safer, and more statistically robust.
Below are the key steps in the algorithm:
In our decentralized environment there are two types of nodes. The first is the aggregator node, also known as the remote node, which holds no data; it stores intermediate results and performs simple operations such as aggregation. The second is the local (regional) node, where the datasets reside.
- Each participating site runs COINSTAC’s decentralized regression to obtain initial β‑coefficients.
- Using those coefficients, the site computes its local mean and local variance.
- These summary statistics—never raw data—are securely sent to the remote aggregator node.
- The aggregator combines all incoming summaries to derive the grand mean and grand variance across sites.
- It broadcasts those global values back to every local node.
- Each node uses the grand statistics to standardize its own dataset.
- It then estimates site‑specific effects via parametric empirical Bayes and adjusts its data accordingly.
- The result: harmonized, site‑neutral data that remain in place and ready for pooled analysis.
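To make the aggregation concrete, here is a minimal, self-contained Python sketch of steps 3–6. It assumes each site reports only its sample count and per-feature mean and variance, and it omits the covariate regression (steps 1–2) and the empirical-Bayes adjustment (step 7). All names are illustrative; this is not the actual COINSTAC/NvFlare implementation.

```python
import numpy as np

def aggregate_summaries(counts, means, variances):
    """Pool per-site summary statistics into a grand mean and grand variance.

    counts:    per-site sample sizes
    means:     per-site feature means, each an array of shape [n_features]
    variances: per-site feature variances, same shape
    """
    n = np.asarray(counts, dtype=float)
    m = np.stack(means)      # shape: [n_sites, n_features]
    v = np.stack(variances)
    total = n.sum()

    # Grand mean: sample-size-weighted average of the site means.
    grand_mean = (n[:, None] * m).sum(axis=0) / total

    # Grand variance: within-site spread plus between-site spread.
    within = ((n - 1)[:, None] * v).sum(axis=0)
    between = (n[:, None] * (m - grand_mean) ** 2).sum(axis=0)
    grand_var = (within + between) / (total - 1)
    return grand_mean, grand_var

def standardize_local(data, grand_mean, grand_var):
    """Standardize one site's data with the broadcast grand statistics."""
    return (data - grand_mean) / np.sqrt(grand_var)

# Example: two hypothetical sites, three features each.
gm, gv = aggregate_summaries(
    counts=[120, 80],
    means=[np.array([1.0, 2.0, 3.0]), np.array([1.2, 1.9, 3.1])],
    variances=[np.array([0.5, 0.4, 0.6]), np.array([0.55, 0.38, 0.61])],
)
```

In the actual computation, only these summaries leave a site; the raw `data` matrix stays local, and the empirical-Bayes site-effect estimation of step 7 runs on the standardized values.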
The computation requires two CSV files as input:
- Covariates File (`CatCovariate.csv`)
- Dependent Variables File (`Data.csv`)
Both files must follow a consistent format, though the specific covariates and dependent variables may vary from study to study. The computation expects these files to match the covariate and dependent-variable names specified in the `parameters.json` file.
The key `covariate_file` in `parameters.json` must match the covariate file's name at each local site.
Example: `test_data/site1/CatCovariate.csv`
- Format: CSV (Comma-Separated Values)
- Headers: The file must include a header row where each column name corresponds to a covariate specified in `parameters.json`.
- Rows: Each row represents a subject, where each column contains the value for a specific covariate.
- Variable Names: The names of the covariates in the header must match the entries in the `covariates_types` section of `parameters.json`.
<Covariate_1>,<Covariate_2>,...,<Covariate_N>
<value_1>,<value_2>,...,<value_N>
<value_1>,<value_2>,...,<value_N>
...
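For example, a covariates file declaring three hypothetical covariates (`age`, `sex`, `isControl`; these names are illustrative, not required by the computation) would look like:

```csv
age,sex,isControl
34,M,true
27,F,false
```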
The key `data_file` in `parameters.json` must match the dependent-variables file's name at each local site.
- Format: CSV (Comma-Separated Values)
- Headers: The file must include a header row where each column name corresponds to an ROI (region of interest) in the brain.
- Rows: Each row represents the same subject as in the covariates file, with values for the dependent variables.
<Dependent_1>,<Dependent_2>,...,<Dependent_N>
<value_1>,<value_2>,...,<value_N>
<value_1>,<value_2>,...,<value_N>
...
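Before running the computation, a site can sanity-check that its two CSV files line up with `parameters.json`. The snippet below is an illustrative standalone check, not part of the computation; the helper name and paths are hypothetical.

```python
import json
import pandas as pd

def validate_site_inputs(params_path: str, site_dir: str) -> None:
    """Verify that a site's CSV files match the names declared in parameters.json."""
    with open(params_path) as f:
        params = json.load(f)

    covariates = pd.read_csv(f"{site_dir}/{params['covariate_file']}")
    data = pd.read_csv(f"{site_dir}/{params['data_file']}")

    # Covariate headers must match the covariates_types section.
    missing = set(params["covariates_types"]) - set(covariates.columns)
    if missing:
        raise ValueError(f"Covariate columns missing from CSV: {missing}")

    # Both files must describe the same subjects, row for row.
    if len(covariates) != len(data):
        raise ValueError("Covariates and data files have different row counts")

validate_site_inputs("test_data/server/parameters.json", "test_data/site1")
```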
- The data provided by each site follows the specified format (standardized covariate and dependent variable headers).
- The computation is run in a federated environment, and each site contributes valid data.
This file is loaded by `combat_controller.py` on the remote node, which then passes it to the edge nodes (executors) in the computation via the FLContext object.
Example: test_data/server/parameters.json
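To illustrate the mechanism, below is a minimal sketch of the general NvFlare controller pattern. It is not the repository's `combat_controller.py`; the task name, property key, and file-path handling are hypothetical.

```python
import json

from nvflare.apis.controller_spec import Task
from nvflare.apis.fl_context import FLContext
from nvflare.apis.impl.controller import Controller
from nvflare.apis.shareable import Shareable
from nvflare.apis.signal import Signal


class ParamsBroadcastController(Controller):
    """Minimal sketch: load parameters.json on the remote node and send it to executors."""

    def start_controller(self, fl_ctx: FLContext):
        with open("test_data/server/parameters.json") as f:
            self.params = json.load(f)
        # The parameters can also be attached to the FLContext so other
        # server-side components can read them.
        fl_ctx.set_prop("combat_parameters", self.params, private=False, sticky=True)

    def control_flow(self, abort_signal: Signal, fl_ctx: FLContext):
        data = Shareable()  # Shareable behaves like a dict
        data["parameters"] = self.params
        task = Task(name="combat_step", data=data)
        # Broadcast the task to every participating site and wait for replies.
        self.broadcast_and_wait(task=task, fl_ctx=fl_ctx, abort_signal=abort_signal)

    def stop_controller(self, fl_ctx: FLContext):
        pass
```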
| Key | Type | Required | Description | Example |
|---|---|---|---|---|
| `covariate_file` | string | ✅ | Covariate file name inside the edge node's data directory | `"CatCovariate.csv"` |
| `data_file` | string | ✅ | Dependent-variables file name inside the edge node's data directory | `"Data.csv"` |
| `combat_algo` | string | ✅ | Which algorithm variant to run during the computation | `combatDC` or `combatMegaDC` |
| `covariates_types` | object | ✅ | Datatype of each column in the covariates file | `{"<Covariate_1>": "int", ...}` |
| `covariates_types["key_name"]` | string | ✅ | Primitive datatype name supported in Python 3.8 | `int`, `float`, `string`, or `bool` |
Note: In the dependent file, each cell value is assumed to be either empty or of type `float`.
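Putting the table together, a complete `parameters.json` might look like the following; the covariate names and types are illustrative, not prescribed by the computation:

```json
{
  "covariate_file": "CatCovariate.csv",
  "data_file": "Data.csv",
  "combat_algo": "combatDC",
  "covariates_types": {
    "age": "int",
    "sex": "string",
    "isControl": "bool"
  }
}
```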
The computation creates three categories of log files:
- Site logs, stored under `test_output/{site_name}/{site_name}.log`.
- Remote logs, stored under `test_output/remote/remote.log`; these are the controller logs.
- The aggregator log file, stored in the same location as the remote logs and specific to the aggregator computation.
Set the environment variable `LOG_LEVEL` (supported values: `info`, `debug`, `error`, `warning`) in `dockerRun.sh`, or pass it to the application directly in the `docker run` command.
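For instance, `LOG_LEVEL` can be passed with Docker's standard `-e` flag; the flags below other than `-e` are illustrative and may differ from what `dockerRun.sh` uses:

```bash
# Illustrative: run the image built in the steps below with debug logging
docker run -it -e LOG_LEVEL=debug nvflare-dccombat
```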
Once the computation completes, it generates the harmonized, site-specific CSV files in the `test_output/{site_name}` directory.
1. Clone the repository.
2. Build the Docker image with the command below:
   `docker build . -t nvflare-dccombat -f Dockerfile-dev`
3. The above command generates a Docker image with the tag `nvflare-dccombat`.
4. Start the Docker container with the `./dockerRun.sh` command. Provide the necessary execute permission for this file first.
5. The above opens a shell inside the container. Run the following command to start the computation:
   `nvflare simulator -c site1,site2 ./app/`
6. Make changes as needed and repeat step 5 to test them.