Add RuleExtractor transformer for sklearn pipeline compatibility #1105 #1146
Open
mariam851 wants to merge 2 commits into rasbt:master from
Hi @rasbt,
This Pull Request addresses issue #1105 by introducing a new RuleExtractor transformer. This enables seamless integration of association rule mining (Apriori/Association Rules) directly within scikit-learn Pipelines.
Key Features:
Full Scikit-learn Compatibility: Inherits from BaseEstimator and TransformerMixin.
Optimized Performance: Automatically casts input DataFrames to bool to ensure compatibility with recent mlxtend updates and avoid DeprecationWarnings.
Error Handling: Gracefully handles empty frequent itemsets to prevent ValueError, ensuring pipeline stability even with low-support data.
Warning Suppression: Uses np.errstate to manage runtime warnings during metric calculations, ensuring a clean output for the user.
Changes:
Created mlxtend/frequent_patterns/pipeline.py with the RuleExtractor class.
Updated mlxtend/frequent_patterns/__init__.py to export the new class.
Added comprehensive unit tests in mlxtend/frequent_patterns/tests/test_pipeline.py.
CI Status Note:
I have verified that all new tests in mlxtend/frequent_patterns/tests/test_pipeline.py are passing locally (2 passed).
I noticed that the CI is failing on test_perceptron.py. I investigated this locally and confirmed it is due to a DeprecationWarning in mlxtend/classifier/perceptron.py:88 (errors += int(update != 0.0)) triggered by NumPy 2.0. These failures are unrelated to the frequent_patterns changes; the RuleExtractor implementation itself is functionally complete and its tests pass locally.
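For context, this kind of warning typically appears when int() is applied to a one-element NumPy array rather than a scalar. The sketch below assumes update is such an array (shape (1,)), which is an assumption about perceptron.py and not something changed in this PR. The helper name count_error is hypothetical.

```python
import numpy as np


def count_error(update):
    # np.any reduces the comparison to a single NumPy bool scalar,
    # so int() is never applied to an array with ndim > 0
    return int(np.any(update != 0.0))


errors = 0
errors += count_error(np.array([0.5]))  # non-zero update counts as an error
errors += count_error(np.array([0.0]))  # zero update does not
```

A fix along these lines would belong in a separate PR, since it touches mlxtend/classifier and not frequent_patterns.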