@mariam851 (Contributor) commented Dec 23, 2025

Hi @rasbt,

This PR addresses issue #1092 regarding incorrect indexing when using feature_groups in feature_importance_permutation.

Changes made:

Shuffling Logic: rng.shuffle failed on multi-column slices, so I replaced it with a row permutation drawn via rng.permutation(X.shape[0]). The same permutation is applied to every column in a group, which keeps the shuffled rows aligned across the group's columns.

Module Export: Updated mlxtend/evaluate/__init__.py to correctly export feature_importance_permutation from the feature_importance.py module.
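To illustrate the shuffling change, here is a minimal NumPy-only sketch. It is not the code from this PR: the toy array, the `group` variable, and the use of `np.random.RandomState` are assumptions for illustration, and the pitfall shown (shuffling a list-indexed column slice only shuffles a temporary copy) is one plausible reading of the failure, not a confirmed diagnosis of issue #1092.

```python
import numpy as np

rng = np.random.RandomState(42)
X = np.arange(12, dtype=float).reshape(4, 3)  # 4 rows, 3 columns
group = [1, 2]  # column indices forming one feature group (illustrative)

# Pitfall: indexing with a list of columns returns a copy, so shuffling
# that temporary in place leaves X itself unchanged.
X_before = X.copy()
rng.shuffle(X[:, group])
shuffle_changed_X = not np.array_equal(X, X_before)

# Row-permutation approach: draw one permutation of the row indices and
# apply it to every column of the group at once, so the grouped columns
# stay aligned with each other while column 0 is left untouched.
perm = rng.permutation(X.shape[0])
X_perm = X.copy()
X_perm[:, group] = X[perm][:, group]
```

Because a single `perm` is reused for all columns in the group, each shuffled row still carries values that came from the same original row.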

Verification (Jupyter Notebook Test):
To verify the fix, I tested the function with a group of non-informative features vs. one highly informative feature. The results confirmed that the shuffling is now working correctly for grouped indices.

Test Case:

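import numpy as np
from mlxtend.evaluate import feature_importance_permutation

# Feature 1 is highly correlated with the target; features 2 and 3 are noise.
# feature_1, feature_2, feature_3, y, and a fitted model are assumed to be defined.
X = np.column_stack([feature_1, feature_2, feature_3])
feature_groups_idx = [0, [1, 2]]  # group 0: column 0; group 1: columns 1 and 2

mean_importance_vals, _ = feature_importance_permutation(
    predict_method=model.predict, X=X, y=y, metric='r2',
    num_rounds=10, feature_groups=feature_groups_idx, seed=42
)

Results: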

Importance of Feature_1 (Group 0): 1.9084

Importance of Features 2&3 (Group 1): 0.0247

Status: SUCCESS (the permutation importance correctly ranks the first group as the primary driver).
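For readers without the patched mlxtend at hand, the check above can be approximated with NumPy alone. The sketch below is a simplified stand-in, not the library's implementation: `r2_score`, `grouped_permutation_importance`, the synthetic data, and the least-squares model are all hypothetical choices made for illustration, so the numbers it produces will differ from the results reported above.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def grouped_permutation_importance(predict, X, y, groups, num_rounds=10, seed=42):
    """Mean drop in R^2 when all columns of a group are permuted together."""
    rng = np.random.RandomState(seed)
    baseline = r2_score(y, predict(X))
    importances = np.zeros(len(groups))
    for g, cols in enumerate(groups):
        cols = np.atleast_1d(cols)  # accept a single index or a list of indices
        drops = []
        for _ in range(num_rounds):
            perm = rng.permutation(X.shape[0])  # one row order per round
            X_perm = X.copy()
            X_perm[:, cols] = X[perm][:, cols]  # same rows for every column in the group
            drops.append(baseline - r2_score(y, predict(X_perm)))
        importances[g] = np.mean(drops)
    return importances

# Synthetic data (assumption): feature 1 drives the target, features 2 and 3 are noise.
data_rng = np.random.RandomState(0)
n = 500
feature_1 = data_rng.normal(size=n)
feature_2 = data_rng.normal(size=n)
feature_3 = data_rng.normal(size=n)
X = np.column_stack([feature_1, feature_2, feature_3])
y = 3.0 * feature_1 + 0.1 * data_rng.normal(size=n)

# Stand-in model: ordinary least squares without an intercept.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(A):
    return A @ coef

importances = grouped_permutation_importance(predict, X, y, groups=[0, [1, 2]])
print("Group 0 (feature 1):", importances[0])
print("Group 1 (features 2 and 3):", importances[1])
```

An importance above 1 is possible with R^2 as the metric, because the score of a model evaluated on a permuted informative feature can fall below zero.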

Note on Local Testing Issues:
During development, I encountered an issue where pytest could not locate the test files or the module in some Windows environments.

Error: file or directory not found: mlxtend/evaluate/tests/test_feature_importance.py

Reason: This appears to be a path resolution/PYTHONPATH issue on Windows and a naming inconsistency between the module (feature_importance.py) and the expected test imports.

Suggestions for Future Contributors:
Standardize Module Naming: Consider renaming feature_importance.py to feature_importance_permutation.py to match the User Guide and prevent ImportError.

Environment Setup: On Windows (PowerShell), set $env:PYTHONPATH = "." from the repository root before running the tests.

Expand Test Suite: Add a specific test case in test_feature_importance.py that handles 2D array slices to ensure no regressions in group shuffling.
