[BUG] AptaNetClassifier/Regressor.fit() mutates global NumPy random state via np.random.seed() #650

@onkar717

Describe the bug

AptaNetClassifier.fit() and AptaNetRegressor.fit() call np.random.seed(self.random_state) when random_state is set. This mutates the global NumPy random state, which is an anti-pattern that causes unintended side effects on any user code using numpy.random in the same process.

The scikit-learn convention is to never set the global seed; instead, a seed or RandomState instance is passed to each component that needs randomness. Here, the RandomForestClassifier/RandomForestRegressor already receives random_state=self.random_state via _build_pipeline(), so the global np.random.seed() call is redundant for pipeline reproducibility.
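
A minimal sketch of the local-generator pattern described above, using only NumPy. `LocalSeedEstimator`, its `fit` body, and `global_key` are hypothetical stand-ins for illustration, not pyaptamer code. scikit-learn's `sklearn.utils.check_random_state` serves the same purpose of turning a seed into a `RandomState` instance.

```python
import numpy as np


class LocalSeedEstimator:
    """Hypothetical estimator that keeps its randomness local."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y=None):
        # A private generator seeded from random_state; the global
        # numpy.random state is never touched.
        rng = np.random.RandomState(self.random_state)
        # Stand-in for any random step inside fit (e.g. shuffling rows).
        self.permutation_ = rng.permutation(len(X))
        return self


def global_key():
    # Copy of the global Mersenne Twister key, used only for comparison.
    return np.random.get_state()[1].copy()


X = np.zeros((20, 5), dtype=np.float32)
before = global_key()
first = LocalSeedEstimator(random_state=42).fit(X).permutation_
second = LocalSeedEstimator(random_state=42).fit(X).permutation_
after = global_key()

print(np.array_equal(before, after))  # global state untouched
print(np.array_equal(first, second))  # results still reproducible
```

Both comparisons should hold: the estimator stays reproducible for a fixed `random_state` without altering what the caller's own `np.random` calls will produce.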

To Reproduce

import numpy as np
from pyaptamer.aptanet import AptaNetClassifier

# User generates some random data
np.random.rand(5)
state_before = np.random.get_state()[1][:3].copy()

# Fitting the classifier resets global NumPy state!
clf = AptaNetClassifier(random_state=42, max_epochs=1, verbose=0)
X = np.random.rand(20, 5).astype(np.float32)
y = np.array([0]*10 + [1]*10, dtype=np.float32)
clf.fit(X, y)

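# Capture the global state immediately after fit()
state_after = np.random.get_state()[1][:3].copy()

# Reference: the state produced by seeding the global generator with 42
np.random.seed(42)
state_seed42 = np.random.get_state()[1][:3].copy()

# Before fix: state_after == state_seed42 (UNEXPECTED!)
print(np.array_equal(state_after, state_seed42))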
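
Until fit() stops seeding globally, one possible caller-side workaround is to snapshot and restore the global state around the call. The sketch below uses only `np.random.get_state()` and `np.random.set_state()`; the helper name `preserve_numpy_global_state` is hypothetical, and the inner `np.random.seed(42)` merely stands in for `clf.fit(X, y)`.

```python
import contextlib

import numpy as np


@contextlib.contextmanager
def preserve_numpy_global_state():
    """Restore numpy's global random state when the block exits."""
    state = np.random.get_state()
    try:
        yield
    finally:
        np.random.set_state(state)


np.random.seed(0)
with preserve_numpy_global_state():
    np.random.seed(42)  # stand-in for clf.fit(X, y) reseeding globally
    np.random.rand(3)
after_block = np.random.rand(3)

# Reference draw: what seed 0 yields when nothing interferes.
np.random.seed(0)
expected = np.random.rand(3)
print(np.array_equal(after_block, expected))  # True
```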

Expected behavior

Calling fit() should not mutate the global NumPy random state. The random_state parameter should only affect reproducibility within the estimator itself, as per scikit-learn conventions.
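
The expectation above could be checked with a small regression helper along these lines. This is a sketch under assumptions: `snapshot`, `same_state`, `leaves_global_state_alone`, `seeds_globally`, and `seeds_locally` are hypothetical names, and the two toy functions stand in for a buggy and a fixed `fit()`.

```python
import numpy as np


def snapshot():
    # get_state() returns (name, key array, pos, has_gauss, cached_gaussian).
    name, key, pos, has_gauss, cached = np.random.get_state()
    return name, key.copy(), pos, has_gauss, cached


def same_state(a, b):
    # Compare the key array element-wise and the remaining fields as a tuple.
    return a[0] == b[0] and np.array_equal(a[1], b[1]) and a[2:] == b[2:]


def leaves_global_state_alone(fn):
    """Return True if calling fn() leaves numpy's global state unchanged."""
    before = snapshot()
    fn()
    return same_state(before, snapshot())


def seeds_globally():
    # Mimics the reported behavior: reseeds the global generator.
    np.random.seed(42)


def seeds_locally():
    # Draws from a private generator only.
    np.random.RandomState(42).rand(3)


# Put the global generator into an arbitrary state that is not seed 42.
np.random.seed(0)
np.random.rand(3)

print(leaves_global_state_alone(seeds_globally))  # False
print(leaves_global_state_alone(seeds_locally))   # True
```

In a real test, `fn` would wrap `clf.fit(X, y)` for both AptaNetClassifier and AptaNetRegressor.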

Additional context

  • The RandomForest estimator already receives random_state via _build_pipeline(), making the np.random.seed() call redundant for that component.
  • torch.manual_seed() is kept because PyTorch has no local seed alternative; this is standard practice in PyTorch-based sklearn estimators.
  • The bug exists in both AptaNetClassifier.fit() (line 122) and AptaNetRegressor.fit() (line 306).

Versions

pyaptamer 0.1.0a1
