Skip to content

Conversation

@cyyever
Copy link
Contributor

@cyyever cyyever commented Sep 7, 2025

Description

Add the FedOBD to research. Experiment results will be added ASAP.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

@cyyever cyyever marked this pull request as draft September 8, 2025 08:13
@cyyever cyyever marked this pull request as ready for review September 17, 2025 04:53
Copy link
Collaborator

@ZiyueXu77 ZiyueXu77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall good, minor fixes

@@ -0,0 +1,4 @@
nvflare~=2.5.0rc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2.5 won't work, we can remove the version number

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Nov 17, 2025

Greptile Overview

Greptile Summary

Adds FedOBD (IJCAI 2023) research implementation to the research directory, including documentation and requirements. The ADAQUANT quantization scheme implementation already exists in nvflare/app_opt/pt/quantization/ada_quant.py.

Key changes:

  • Updated research/README.md with FedOBD entry and renumbered all entries (01-10)
  • Added research/fedobd/README.md with paper abstract, setup instructions, and usage examples
  • Added research/fedobd/requirements.txt with dependencies

Issues found:

  • Incorrect script name in README (llm_hf_fl_job.py should be job.py)
  • Previous reviewer flagged markdown link formatting and missing code fences (already noted in previous threads)

Confidence Score: 3/5

  • Safe to merge after fixing the script name - documentation-only change with one critical typo
  • The PR adds documentation for an existing implementation (ADAQUANT quantization is already in the codebase). The only critical issue is the incorrect script filename that would prevent users from running the example. Other issues noted in previous threads (markdown formatting, code fences) are style issues
  • research/fedobd/README.md requires correction of script name on line 46

Important Files Changed

File Analysis

Filename Score Overview
research/README.md 5/5 Added FedOBD entry to research implementations list with proper numbering (renumbered all entries from 01-10)
research/fedobd/README.md 2/5 New README for FedOBD implementation. Contains incorrect script name llm_hf_fl_job.py (should be job.py), broken markdown link, and missing code fences for bash commands
research/fedobd/requirements.txt 5/5 Simple requirements file with nvflare, torch, torchvision, and tensorboard dependencies

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Repo as Repository
    participant Research as research/
    participant FedOBD as research/fedobd/
    participant Quantizer as nvflare/app_opt/pt/quantization/
    
    Dev->>Research: Add FedOBD to research list
    Dev->>FedOBD: Create README.md
    Dev->>FedOBD: Create requirements.txt
    Note over FedOBD: References ADAQUANT implementation
    FedOBD-->>Quantizer: Uses ada_quant.py
    Note over Quantizer: ADAQUANT already implemented<br/>in nvflare codebase
    Dev->>Repo: Update research/README.md
    Note over Repo: Renumber all entries 01-10
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile
React with 👍 or 👎 to share your feedback on this new summary format


## Steps to run the code

Let's follow the steps in the [quantization examples] (https://github.com/NVIDIA/NVFlare/tree/main/examples/advanced/llm_hf).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: markdown link has space before URL which breaks the link

Suggested change
Let's follow the steps in the [quantization examples] (https://github.com/NVIDIA/NVFlare/tree/main/examples/advanced/llm_hf).
Let's follow the steps in the [quantization examples](https://github.com/NVIDIA/NVFlare/tree/main/examples/advanced/llm_hf).

Comment on lines +30 to +46
### Data Preparation
cd examples/advanced/llm_hf

mkdir dataset
cd dataset
git clone https://huggingface.co/datasets/tatsu-lab/alpaca
git clone https://huggingface.co/datasets/databricks/databricks-dolly-15k
git clone https://huggingface.co/datasets/OpenAssistant/oasst1
cd ..
mkdir dataset/dolly
python ./utils/preprocess_dolly.py --training_file dataset/databricks-dolly-15k/databricks-dolly-15k.jsonl --output_dir dataset/dolly
python ./utils/preprocess_alpaca.py --training_file dataset/alpaca/data/train-00000-of-00001-a09b74b3ef9c3b56.parquet --output_dir dataset/alpaca
python ./utils/preprocess_oasst1.py --training_file dataset/oasst1/data/train-00000-of-00001-b42a775f407cee45.parquet --validation_file dataset/oasst1/data/validation-00000-of-00001-134b8fd0c89408b6.parquet --output_dir dataset/oasst1

### Run ADAQUANT

python3 llm_hf_fl_job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_adaquant --job_dir ${PWD}/workspace/jobs/hf_sft_adaquant --train_mode SFT --quantize_mode adaquant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: bash commands not wrapped in code block - should use ```bash fencing for proper formatting

Suggested change
### Data Preparation
cd examples/advanced/llm_hf
mkdir dataset
cd dataset
git clone https://huggingface.co/datasets/tatsu-lab/alpaca
git clone https://huggingface.co/datasets/databricks/databricks-dolly-15k
git clone https://huggingface.co/datasets/OpenAssistant/oasst1
cd ..
mkdir dataset/dolly
python ./utils/preprocess_dolly.py --training_file dataset/databricks-dolly-15k/databricks-dolly-15k.jsonl --output_dir dataset/dolly
python ./utils/preprocess_alpaca.py --training_file dataset/alpaca/data/train-00000-of-00001-a09b74b3ef9c3b56.parquet --output_dir dataset/alpaca
python ./utils/preprocess_oasst1.py --training_file dataset/oasst1/data/train-00000-of-00001-b42a775f407cee45.parquet --validation_file dataset/oasst1/data/validation-00000-of-00001-134b8fd0c89408b6.parquet --output_dir dataset/oasst1
### Run ADAQUANT
python3 llm_hf_fl_job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_adaquant --job_dir ${PWD}/workspace/jobs/hf_sft_adaquant --train_mode SFT --quantize_mode adaquant
### Data Preparation
```bash
cd examples/advanced/llm_hf
mkdir dataset
cd dataset
git clone https://huggingface.co/datasets/tatsu-lab/alpaca
git clone https://huggingface.co/datasets/databricks/databricks-dolly-15k
git clone https://huggingface.co/datasets/OpenAssistant/oasst1
cd ..
mkdir dataset/dolly
python ./utils/preprocess_dolly.py --training_file dataset/databricks-dolly-15k/databricks-dolly-15k.jsonl --output_dir dataset/dolly
python ./utils/preprocess_alpaca.py --training_file dataset/alpaca/data/train-00000-of-00001-a09b74b3ef9c3b56.parquet --output_dir dataset/alpaca
python ./utils/preprocess_oasst1.py --training_file dataset/oasst1/data/train-00000-of-00001-b42a775f407cee45.parquet --validation_file dataset/oasst1/data/validation-00000-of-00001-134b8fd0c89408b6.parquet --output_dir dataset/oasst1

Run ADAQUANT

python3 llm_hf_fl_job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_adaquant --job_dir ${PWD}/workspace/jobs/hf_sft_adaquant --train_mode SFT --quantize_mode adaquant


## Implementation

A quantization scheme called "ADAQUANT" has been added to NVFLARE under **nvflare/app_opt/pt/quantization**, which is based on our [official implementation](https://github.com/cyyever/distributed_learning_simulator).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: README claims ADAQUANT quantization scheme was added to nvflare/app_opt/pt/quantization, but that directory only contains generic quantization code with no "adaquant" mode. The quantizer only supports FLOAT16, BLOCKWISE8, FLOAT4, and NORMFLOAT4. Either the implementation is missing or the claim is inaccurate

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile


### Run ADAQUANT

python3 llm_hf_fl_job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_adaquant --job_dir ${PWD}/workspace/jobs/hf_sft_adaquant --train_mode SFT --quantize_mode adaquant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: incorrect script name - should be job.py not llm_hf_fl_job.py

Suggested change
python3 llm_hf_fl_job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_adaquant --job_dir ${PWD}/workspace/jobs/hf_sft_adaquant --train_mode SFT --quantize_mode adaquant
python3 job.py --client_ids dolly --data_path ${PWD}/dataset --workspace_dir ${PWD}/workspace/hf_sft_adaquant --job_dir ${PWD}/workspace/jobs/hf_sft_adaquant --train_mode SFT --quantize_mode adaquant

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants