Commit 60c917a

YuanTingHsieh, chesterxgchen, and claude authored
Cherry-pick [2.7] Pin pandas<3.0 and fix pandas 3.x compatibility in federated statistics (#4227) (#4292)
## Summary

Starting from pandas 3.0, two breaking changes caused **all federated statistics jobs to fail** with `EXECUTION_RESULT_ERROR` on every client:

**Issue 1: `AttributeError: 'StringDtype' object has no attribute 'char'`**

pandas 3.0 enables `future.infer_string` by default, so `pd.read_csv()` infers string columns as `pd.StringDtype`, a pandas ExtensionDtype that has no `.char` attribute. This crashed `dtype_to_data_type()` in `numpy_utils.py` immediately on any DataFrame with string columns.

**Issue 2: `AttributeError: 'Series' object has no attribute 'ravel'`**

`Series.ravel()` was deprecated in pandas 2.2.0 and **removed** in pandas 3.0. Every `histogram()` method in the statistics module called `feature.ravel()` on a pandas Series.

Both errors produced empty statistics output `{}` and affected `df_stats_job`, `df_stats_job_scale`, and `hierarchical_stats` jobs.

## Changes

**Guard / Pin (prevents pandas 3.0 from being installed):**

- Pin `pandas~=2.3` across 16 `requirements.txt` files (15 previously unpinned + new `df_stats/requirements.txt`)

**Code fixes (forward-compatible with pandas 3.x when pin is eventually lifted):**

- `nvflare/app_common/statistics/numpy_utils.py`: add `hasattr(dtype, "char")` guard in `dtype_to_data_type()`; ExtensionDtypes without `.char` (e.g. `StringDtype`) fall through to `DataType.STRING`
- `nvflare/app_opt/statistics/df/df_core_statistics.py`: `feature.ravel()` → `feature.to_numpy()`
- `examples/advanced/federated-statistics/hierarchical_stats/.../hierarchical_stats.py`: `feature.ravel()` → `feature.to_numpy()`
- `tests/unit_test/app_common/executors/statistics/mock_df_stats_executor.py`: `feature.ravel()` → `feature.to_numpy()`

**New file:**

- `examples/advanced/federated-statistics/df_stats/requirements.txt`: was missing; created from `image_stats/requirements.txt` minus image-specific deps (`monai`, `kaleido`, `kagglehub`)

**New tests:**

- `tests/unit_test/app_common/statistics/numpy_utils_test.py`: 11 tests covering all `dtype_to_data_type()` branches including the new `pd.StringDtype` case

## Test plan

- [ ] All 11 new unit tests in `numpy_utils_test.py` pass
- [ ] Existing `statistics_executor_test.py` tests continue to pass (histogram path exercises `feature.to_numpy()`)
- [ ] Run `df_stats_job` and `hierarchical_stats` jobs end-to-end with pandas 2.3

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Fixes # .

### Description

A few sentences describing the changes proposed in this pull request.

### Types of changes

<!--- Put an `x` in all the boxes that apply, and remove the not applicable items -->

- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Quick tests passed locally by running `./runtest.sh`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated.

Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
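The `hasattr(dtype, "char")` guard described for `dtype_to_data_type()` can be illustrated with a minimal sketch. This is not the NVFlare implementation: `DataType` is a cut-down stand-in enum, the set of character codes checked is an assumption (a subset of NumPy's one-letter type codes), and `NumpyLikeDtype` and `ExtensionLikeDtype` are hypothetical stand-ins for a NumPy dtype (which exposes `.char`) and a pandas ExtensionDtype such as `StringDtype` (which does not).

```python
from enum import Enum


class DataType(Enum):
    # Cut-down stand-in for the real enum; member names are assumptions.
    INT = "int"
    FLOAT = "float"
    STRING = "string"


class NumpyLikeDtype:
    """Hypothetical stand-in for a NumPy dtype, which exposes a one-letter `.char` code."""

    def __init__(self, char: str):
        self.char = char


class ExtensionLikeDtype:
    """Hypothetical stand-in for a pandas ExtensionDtype with no `.char` attribute."""


def dtype_to_data_type(dtype) -> DataType:
    # Guard first: only consult `.char` when the dtype actually has one.
    if hasattr(dtype, "char"):
        if dtype.char in "bhilqBHILQ":  # subset of NumPy integer type codes
            return DataType.INT
        if dtype.char in "efdg":  # NumPy floating-point type codes
            return DataType.FLOAT
    # Everything else, including dtypes without `.char`, is treated as a string.
    return DataType.STRING


int_result = dtype_to_data_type(NumpyLikeDtype("l"))
float_result = dtype_to_data_type(NumpyLikeDtype("d"))
ext_result = dtype_to_data_type(ExtensionLikeDtype())
print(int_result, float_result, ext_result)
```

Without the guard, the `ExtensionLikeDtype` call would raise `AttributeError` on the first `.char` access, which mirrors the failure reported in Issue 1.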
1 parent 7886405 commit 60c917a

22 files changed: +100 -22 lines changed

examples/advanced/cifar10/pt/cifar10-sim/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -4,5 +4,5 @@ torchvision
 tensorboard
 matplotlib
 seaborn
-pandas
+pandas~=2.3
 tbparse
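The `~=` operator used in these pins is the "compatible release" specifier: `pandas~=2.3` is equivalent to `pandas>=2.3, ==2.*`, so it admits 2.3.x and any later 2.x release but excludes 3.0. The sketch below checks that logic for plain dotted numeric versions only. `parse` and `compatible_release` are hypothetical helpers written for illustration; they ignore pre-release, post-release, and other suffixes that real installers handle.

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a plain dotted numeric version such as "2.3.1" into (2, 3, 1)."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec` (numeric versions only)."""
    cand, base = parse(candidate), parse(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two version components")
    # All but the last component of the spec must match exactly (the ==2.* part).
    prefix = base[:-1]
    # Zero-pad both versions so that "2.3" and "2.3.0" compare as equal.
    width = max(len(cand), len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    base_padded = base + (0,) * (width - len(base))
    return cand_padded >= base_padded and cand[: len(prefix)] == prefix


results = {
    v: compatible_release(v, "2.3")
    for v in ["2.2.3", "2.3.0", "2.3.3", "2.10.1", "3.0.0"]
}
print(results)
```

Only the 2.x versions at or above 2.3 pass, which is why the pin keeps pandas 3.0 from being installed while still allowing later 2.x releases.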
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+nvflare~=2.7.2rc
+numpy
+pandas~=2.3
+matplotlib
+jupyter
+notebook

examples/advanced/federated-statistics/hierarchical_stats/jobs/hierarchical_stats/app/custom/hierarchical_stats.py

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ def histogram(
 
         df = self.data[dataset_name]
         feature: Series = df[feature_name]
-        flattened = feature.ravel()
+        flattened = feature.to_numpy()
         flattened = flattened[flattened != np.array(None)]
         buckets = get_std_histogram_buckets(flattened, num_of_bins, BinRange(global_min_value, global_max_value))
         return Histogram(HistogramType.STANDARD, buckets)
Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 nvflare~=2.7.0rc
 ipywidgets
 numpy
-pandas
+pandas~=2.3
 matplotlib
 jupyterlab
 jupyter

examples/advanced/federated-statistics/image_stats/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 nvflare~=2.7.2rc
 numpy
 monai[itk]
-pandas
+pandas~=2.3
 kaleido
 matplotlib
 jupyter

examples/advanced/gnn/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -5,4 +5,4 @@ tensorboard
 scikit-learn
 tqdm
 filelock
-pandas
+pandas~=2.3
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 nvflare~=2.7.2rc
-pandas
+pandas~=2.3
 scikit-learn
 joblib

examples/advanced/monai/spleen_ct_segmentation/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ monai>=1.3.0
 nvflare[PT]>=2.7.2rc
 tensorboard
 scikit-image
-pandas
+pandas~=2.3
 matplotlib
 monai[ignite,tqdm]
 huggingface_hub
Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 nvflare~=2.7.2rc
 openmined-psi>=2.0.5
-pandas
+pandas~=2.3
Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 nvflare~=2.7.2rc
-pandas
+pandas~=2.3
 scikit-learn
 joblib
 tensorboard
