Initial Feature Election algorithm version for Nvidia FLARE #3817
Open
christofilojohn wants to merge 70 commits into NVIDIA:main from christofilojohn:main
+3,061
−0
Changes from 16 commits
Commits
de8a877 [WIP] Initial Feature Election algorithm version for Nnidia FLARE (christofilojohn)
bb3332f [WIP] Readme mistake (christofilojohn)
345f430 Merge branch 'main' into main (christofilojohn)
f7ebada Merge branch 'main' into main (chesterxgchen)
973b818 Merge branch 'main' into main (chesterxgchen)
9b53aad Update examples/advanced/feature_election/requirements.txt (christofilojohn)
1144051 Update nvflare/app_opt/feature_election/controller.py (christofilojohn)
ad4587a Update nvflare/app_opt/feature_election/README.md (christofilojohn)
56a5c91 Update nvflare/app_opt/feature_election/executor.py (christofilojohn)
cc50fbb [WIP] comments cleanup (christofilojohn)
de4e8ff [WIP] Implemented minor changes on imports and removing PPIMBC_Model.… (christofilojohn)
e63ab36 [WIP] Clarification, extra variable k - partition index in controller (christofilojohn)
cdf7372 [WIP] Another import restructure (christofilojohn)
93de00b Removed redundant components, added apache licence files (christofilojohn)
eff32d3 Update nvflare/app_opt/feature_election/executor.py (christofilojohn)
5e15d7b Merge branch 'main' into main (christofilojohn)
10a5c26 Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
1adb62a Update nvflare/app_opt/feature_election/executor.py (christofilojohn)
454551c Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
49ffa97 Pyimpetus cleanup on executor.py (christofilojohn)
53015da Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
b8453dd Remove unused imports, per greptile suggestions (christofilojohn)
9b534e1 Minor cleanup, following greptile comments (christofilojohn)
f736c22 feature election masks now work with both True/False and 1/0 format, … (christofilojohn)
5030be8 comment on feature election global mask process (christofilojohn)
1418fd6 Skip PyImpetus test if the dependency is not installed (christofilojohn)
7df272f Update nvflare/app_opt/feature_election/executor.py (christofilojohn)
f052099 Update nvflare/app_opt/feature_election/controller.py (christofilojohn)
e7e4af1 Executor minor cleanup, with fixed imports (christofilojohn)
f184c15 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
aed3826 Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
ac2012f Update nvflare/app_opt/feature_election/README.md (christofilojohn)
ea22b32 moved import to top (christofilojohn)
bd0a6d9 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
e8bc20c Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
07c2e53 Requirements newline (christofilojohn)
b8b292c Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
3a0580e fixed path on README (christofilojohn)
60c413b added empty init file to test folder for proper python packaging (christofilojohn)
31f0792 Merge branch 'main' into main (christofilojohn)
812a1ae Moved tests inside feature election package (christofilojohn)
a0cc193 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
6223d6f Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
f93c677 fixed boolean type safety errors, removed redundant installation note… (christofilojohn)
6f3de84 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
449a1ea Update nvflare/app_opt/feature_election/controller.py (christofilojohn)
0dace50 cleanup of imports (christofilojohn)
f78857c Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
9197f4f Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
3140b3d removed empty line (christofilojohn)
a464527 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
c2df497 cleanup, removed installation notes which was redundant (christofilojohn)
22f2f24 Merge branch 'main' into main (christofilojohn)
d00a48d Merge branch 'main' into main (christofilojohn)
c34e384 fixed readme greptile recommendations (christofilojohn)
757e7a1 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
3eb7645 Merge branch 'main' into main (christofilojohn)
acde3a5 Merge branch 'main' into main (christofilojohn)
e86b2a5 Merge branch 'main' into main (christofilojohn)
687a253 fixed load_client_data mistake, Improved more accurate README file (christofilojohn)
5f5ebcc changed evaluate_model method to public (christofilojohn)
5923d66 Reformatted files based on CONTRIBUTING.md (christofilojohn)
984991e Update examples/advanced/feature_election/flare_deployment.py (christofilojohn)
c2a3552 Update nvflare/app_opt/feature_election/executor.py (christofilojohn)
74e136b documentation changes (christofilojohn)
b7f9b95 Merge branch 'main' into main (christofilojohn)
a208bb3 Merge branch 'main' into main (christofilojohn)
24847fa Merge branch 'main' into main (chesterxgchen)
05616fb cleanup on text, comments, newlines (christofilojohn)
1cd10c9 Merge branch 'main' of https://github.com/christofilojohn/NVFlare (christofilojohn)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| # Installation Notes for NVIDIA FLARE Maintainers | ||
|
|
||
| ## Adding Feature Election to setup.py | ||
|
|
||
| When integrating this module, please add the following to NVFlare's `setup.py`: | ||
|
|
||
| ### In `extras_require`: | ||
| ```python | ||
| extras_require={ | ||
| # ... existing extras ... | ||
|
|
||
| "feature_election": [ | ||
| "scikit-learn>=1.0.0", | ||
| "PyImpetus>=0.0.6", # Optional advanced methods | ||
| ], | ||
|
|
||
| # Or split into basic/advanced | ||
| "feature_election_basic": [ | ||
| "scikit-learn>=1.0.0", | ||
| ], | ||
|
|
||
| "feature_election_advanced": [ | ||
| "scikit-learn>=1.0.0", | ||
| "PyImpetus>=0.0.6", | ||
| ], | ||
| } | ||
| ``` | ||
|
|
||
| ## User Installation | ||
|
|
||
| Then users can install with: | ||
| ```bash | ||
| # Basic (most common) | ||
| pip install "nvflare[feature_election_basic]" | ||
|
|
||
| # Advanced (with PyImpetus) | ||
| pip install "nvflare[feature_election_advanced]" | ||
|
|
||
| # Or install everything | ||
| pip install "nvflare[feature_election]" | ||
| ``` | ||
|
|
||
| ## Rationale | ||
|
|
||
| - scikit-learn is widely available | ||
| - PyImpetus is optional for advanced permutation-based feature selection | ||
| - Module works without PyImpetus (gracefully degrades to standard methods) | ||
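Reviewer note: the last Rationale bullet (the module keeps working when PyImpetus is missing) could be implemented roughly as follows. This is only a minimal sketch, not the module's actual code. The flag `PYIMPETUS_AVAILABLE`, the helper `resolve_fs_method`, the method name `"pyimpetus"`, and the choice of `"lasso"` as the fallback are all hypothetical names and assumptions made for illustration.

```python
import importlib.util
import warnings

# Hypothetical flag: True only when the optional PyImpetus package can be imported.
PYIMPETUS_AVAILABLE = importlib.util.find_spec("PyImpetus") is not None


def resolve_fs_method(requested: str,
                      available: bool = PYIMPETUS_AVAILABLE,
                      fallback: str = "lasso") -> str:
    """Return the feature selection method to use.

    If the (hypothetical) "pyimpetus" method is requested but the optional
    dependency is not installed, warn and fall back to a standard method
    instead of raising ImportError.
    """
    if requested == "pyimpetus" and not available:
        warnings.warn(
            f"PyImpetus is not installed; falling back to '{fallback}'.",
            RuntimeWarning,
            stacklevel=2,
        )
        return fallback
    return requested


# Standard methods pass through unchanged; only the optional one degrades.
print(resolve_fs_method("mutual_info", available=False))
print(resolve_fs_method("pyimpetus", available=False))
```

Checking availability with `importlib.util.find_spec` avoids importing the optional package at module load time, so the basic install path does not pay for it.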
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,263 @@ | ||
| # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
|
|
||
| """ | ||
| Basic Usage Example for Feature Election in NVIDIA FLARE | ||
|
|
||
| This example demonstrates the simplest way to use Feature Election | ||
| for federated feature selection on tabular datasets. | ||
| """ | ||
|
|
||
| import pandas as pd | ||
| import numpy as np | ||
|
||
| from sklearn.datasets import make_classification | ||
| from sklearn.model_selection import train_test_split | ||
| from sklearn.ensemble import RandomForestClassifier | ||
| from sklearn.metrics import accuracy_score, f1_score | ||
| from nvflare.app_opt.feature_election import quick_election | ||
|
|
||
|
|
||
| def create_sample_dataset(): | ||
| """Create a sample high-dimensional dataset""" | ||
| X, y = make_classification( | ||
| n_samples=1000, | ||
| n_features=100, | ||
| n_informative=20, | ||
| n_redundant=30, | ||
| n_repeated=10, | ||
| random_state=42 | ||
| ) | ||
|
|
||
| # Create meaningful feature names | ||
| feature_names = [f"feature_{i:03d}" for i in range(100)] | ||
| df = pd.DataFrame(X, columns=feature_names) | ||
| df['target'] = y | ||
|
|
||
| print(f"Created dataset: {df.shape[0]} samples, {df.shape[1]-1} features") | ||
| return df | ||
|
|
||
|
|
||
| def example_1_quick_start(): | ||
| """Example 1: Quickstart - simplest usage""" | ||
| print("\n" + "="*60) | ||
| print("Example 1: Quick Start") | ||
| print("="*60) | ||
|
|
||
| # Create dataset | ||
| df = create_sample_dataset() | ||
|
|
||
| # Run Feature Election with just one line! | ||
| selected_mask, stats = quick_election( | ||
| df=df, | ||
| target_col='target', | ||
| num_clients=4, | ||
| fs_method='lasso', | ||
| auto_tune=True | ||
| ) | ||
|
|
||
| # Print results | ||
| print(f"\nOriginal features: {stats['num_features_original']}") | ||
| print(f"Selected features: {stats['num_features_selected']}") | ||
| print(f"Reduction: {stats['reduction_ratio']:.1%}") | ||
| print(f"Optimal freedom_degree: {stats['freedom_degree']:.2f}") | ||
|
|
||
| # Get selected feature names | ||
| feature_names = [col for col in df.columns if col != 'target'] | ||
| selected_features = [feature_names[i] for i, selected in enumerate(selected_mask) if selected] | ||
| print(f"\nFirst 10 selected features: {selected_features[:10]}") | ||
|
|
||
|
|
||
| def example_2_with_evaluation(): | ||
| """Example 2: With model evaluation""" | ||
| print("\n" + "="*60) | ||
| print("Example 2: With Model Evaluation") | ||
| print("="*60) | ||
|
|
||
| # Create dataset | ||
| df = create_sample_dataset() | ||
|
|
||
| # Split data | ||
| X = df.drop('target', axis=1) | ||
| y = df['target'] | ||
| X_train, X_test, y_train, y_test = train_test_split( | ||
| X, y, test_size=0.2, random_state=42, stratify=y | ||
| ) | ||
|
|
||
| # Prepare DataFrame for feature election (using training data only) | ||
| df_train = X_train.copy() | ||
| df_train['target'] = y_train | ||
|
|
||
| # Run Feature Election | ||
| selected_mask, stats = quick_election( | ||
| df=df_train, | ||
| target_col='target', | ||
| num_clients=4, | ||
| fs_method='lasso', | ||
| auto_tune=True | ||
| ) | ||
|
|
||
| # Apply mask to get selected features | ||
| X_train_selected = X_train.iloc[:, selected_mask] | ||
| X_test_selected = X_test.iloc[:, selected_mask] | ||
|
|
||
| # Train models | ||
| print("\nTraining models...") | ||
|
|
||
| # Model with all features | ||
| clf_all = RandomForestClassifier(n_estimators=100, random_state=42) | ||
| clf_all.fit(X_train, y_train) | ||
| y_pred_all = clf_all.predict(X_test) | ||
|
|
||
| # Model with selected features | ||
| clf_selected = RandomForestClassifier(n_estimators=100, random_state=42) | ||
| clf_selected.fit(X_train_selected, y_train) | ||
| y_pred_selected = clf_selected.predict(X_test_selected) | ||
|
|
||
| # Compare results | ||
| print("\nResults:") | ||
| print("-" * 60) | ||
| print(f"{'Metric':<20} {'All Features':<20} {'Selected Features':<20}") | ||
| print("-" * 60) | ||
| print(f"{'Accuracy':<20} {accuracy_score(y_test, y_pred_all):<20.4f} {accuracy_score(y_test, y_pred_selected):<20.4f}") | ||
| print(f"{'F1 Score':<20} {f1_score(y_test, y_pred_all):<20.4f} {f1_score(y_test, y_pred_selected):<20.4f}") | ||
| print(f"{'# Features':<20} {X_train.shape[1]:<20} {X_train_selected.shape[1]:<20}") | ||
| print("-" * 60) | ||
|
|
||
|
|
||
| def example_3_custom_configuration(): | ||
| """Example 3: Custom configuration""" | ||
| print("\n" + "="*60) | ||
| print("Example 3: Custom Configuration") | ||
| print("="*60) | ||
|
|
||
| from nvflare.app_opt.feature_election import FeatureElection | ||
|
||
|
|
||
| # Create dataset | ||
| df = create_sample_dataset() | ||
|
|
||
| # Initialize with custom parameters | ||
| fe = FeatureElection( | ||
| freedom_degree=0.6, | ||
| fs_method='elastic_net', | ||
| aggregation_mode='weighted' | ||
| ) | ||
|
|
||
| # Prepare data splits | ||
| client_data = fe.prepare_data_splits( | ||
| df=df, | ||
| target_col='target', | ||
| num_clients=5, | ||
| split_strategy='stratified' | ||
| ) | ||
|
||
|
|
||
| print(f"Prepared data for {len(client_data)} clients") | ||
| for i, (X, y) in enumerate(client_data): | ||
| print(f" Client {i+1}: {len(X)} samples, class distribution: {y.value_counts().to_dict()}") | ||
|
|
||
| # Run election | ||
| stats = fe.simulate_election(client_data) | ||
|
||
|
|
||
| # Print results | ||
| print("\nElection Results:") | ||
| print(f" Features selected: {stats['num_features_selected']}/{stats['num_features_original']}") | ||
| print(f" Reduction: {stats['reduction_ratio']:.1%}") | ||
| print(f" Intersection features: {stats['intersection_features']}") | ||
| print(f" Union features: {stats['union_features']}") | ||
|
|
||
| # Print client statistics | ||
| print("\nPer-Client Statistics:") | ||
| for client_name, client_stats in stats['client_stats'].items(): | ||
| print(f" {client_name}:") | ||
| print(f" Features selected: {client_stats['num_selected']}") | ||
| print(f" Score improvement: {client_stats['improvement']:+.4f}") | ||
|
|
||
| # Save results | ||
| fe.save_results("feature_election_results.json") | ||
|
||
| print("\n✓ Results saved to feature_election_results.json") | ||
|
|
||
|
|
||
| def example_4_different_methods(): | ||
| """Example 4: Compare different feature selection methods""" | ||
| print("\n" + "="*60) | ||
| print("Example 4: Comparing Different FS Methods") | ||
| print("="*60) | ||
|
|
||
| # Create dataset | ||
| df = create_sample_dataset() | ||
|
|
||
| methods = ['lasso', 'elastic_net', 'random_forest', 'mutual_info', 'f_classif'] | ||
| results = {} | ||
|
|
||
| for method in methods: | ||
| print(f"\nTesting {method}...") | ||
| selected_mask, stats = quick_election( | ||
| df=df, | ||
| target_col='target', | ||
| num_clients=4, | ||
| fs_method=method, | ||
| auto_tune=False, | ||
| freedom_degree=0.5 | ||
| ) | ||
|
|
||
| results[method] = { | ||
| 'selected': stats['num_features_selected'], | ||
| 'reduction': stats['reduction_ratio'], | ||
| 'intersection': stats['intersection_features'], | ||
| 'union': stats['union_features'] | ||
| } | ||
|
|
||
| # Display comparison | ||
| print("\n" + "="*60) | ||
| print("Method Comparison") | ||
| print("="*60) | ||
| print(f"{'Method':<15} {'Selected':<12} {'Reduction':<12} {'Intersection':<12} {'Union':<10}") | ||
| print("-" * 60) | ||
| for method, res in results.items(): | ||
| print(f"{method:<15} {res['selected']:<12} {res['reduction']:<11.1%} {res['intersection']:<12} {res['union']:<10}") | ||
|
|
||
|
|
||
| def main(): | ||
| """Run all examples""" | ||
| print("\n" + "="*70) | ||
| print(" Feature Election for NVIDIA FLARE - Basic Examples") | ||
| print("="*70) | ||
|
|
||
| try: | ||
| example_1_quick_start() | ||
| except Exception as e: | ||
| print(f"Example 1 failed: {e}") | ||
|
|
||
| try: | ||
| example_2_with_evaluation() | ||
| except Exception as e: | ||
| print(f"Example 2 failed: {e}") | ||
|
|
||
| try: | ||
| example_3_custom_configuration() | ||
| except Exception as e: | ||
| print(f"Example 3 failed: {e}") | ||
|
|
||
| try: | ||
| example_4_different_methods() | ||
| except Exception as e: | ||
| print(f"Example 4 failed: {e}") | ||
|
|
||
| print("\n" + "="*70) | ||
| print(" All examples completed!") | ||
| print("="*70) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() | ||
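Reviewer note: Example 2 indexes with `X_train.iloc[:, selected_mask]`, and commit f736c22 mentions masks in both True/False and 1/0 format. The two formats behave differently under positional indexing, so it may be worth normalizing the mask before applying it. The sketch below is an illustration only: `apply_feature_mask` is a hypothetical helper, not part of this PR or of NVFlare, and it assumes the mask has one entry per feature column.

```python
import numpy as np
import pandas as pd


def apply_feature_mask(X: pd.DataFrame, mask) -> pd.DataFrame:
    """Select columns of X using a mask given as True/False or 1/0 values.

    Hypothetical helper: converts the mask to a boolean array first, so that
    a 1/0 integer mask is never interpreted as a list of column positions.
    """
    mask_arr = np.asarray(mask)
    if mask_arr.ndim != 1 or mask_arr.shape[0] != X.shape[1]:
        raise ValueError(
            f"Mask length {mask_arr.shape} does not match {X.shape[1]} features"
        )
    if mask_arr.dtype != bool:
        # Accept only 0/1 values before casting; anything else is likely a bug.
        if not np.isin(mask_arr, (0, 1)).all():
            raise ValueError("Non-boolean mask must contain only 0 and 1")
        mask_arr = mask_arr.astype(bool)
    return X.loc[:, mask_arr]


X = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["a", "b", "c", "d"])
bool_sel = apply_feature_mask(X, [True, False, True, False])
int_sel = apply_feature_mask(X, [1, 0, 1, 0])
print(list(bool_sel.columns), list(int_sel.columns))
```

Without the conversion, passing an integer list such as `[1, 0, 1, 0]` directly to `iloc` is read as column positions rather than as a mask, which silently selects the wrong columns.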