[BUG] Non deterministic make_classification. #6510

@trivialfis

Describe the bug
Multiple calls to make_classification produce different results even when given the same random_state.

Steps/Code to reproduce bug

def test_cuml_gen() -> None:
    import cupy as cp
    from cuml.datasets import make_classification

    n_samples_per_batch = 8192
    n_features = 400
    rs = n_samples_per_batch * n_features * 4
    X0, y0 = make_classification(
        n_samples_per_batch,
        n_features,
        n_redundant=0,
        n_repeated=0,
        n_informative=n_features,
        random_state=rs,
    )
    X1, y1 = make_classification(
        n_samples_per_batch,
        n_features,
        n_redundant=0,
        n_repeated=0,
        n_informative=n_features,
        random_state=rs,
    )

    cp.testing.assert_allclose(X0, X1)
    cp.testing.assert_allclose(y0, y1)

Running the test fails with:

    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Not equal to tolerance rtol=1e-07, atol=0
E
E           Mismatched elements: 1535867 / 3276800 (46.9%)
E           Max absolute difference: 2.000002
E           Max relative difference: 1118481.1
E            x: array([[  5.839701,  -4.92806 , -18.028366, ..., -16.778223,   3.611579,
E                    -6.924231],
E                  [ 17.475462, -13.639202,  -0.215977, ...,   5.726135,  -4.760585,...
E            y: array([[  7.839701,  -4.92806 , -18.028366, ..., -16.778223,   3.611579,
E                    -6.924231],
E                  [ 19.475462, -13.639202,  -0.215977, ...,   5.726135,  -4.760585,...
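A quick sanity check on the numbers in the traceback above, as a minimal sketch using only the values visible in the excerpt. In the rows shown, the two arrays appear to differ only in the first column, and by a constant shift of about 2.0, which agrees with the reported maximum absolute difference. This is only an observation about the printed excerpt, not a diagnosis of the cause.

```python
import math

# Values copied from the first column of the traceback excerpt.
x_first_col = [5.839701, 17.475462]  # printed under "x:" (X0)
y_first_col = [7.839701, 19.475462]  # printed under "y:" (X1)

offsets = [b - a for a, b in zip(x_first_col, y_first_col)]

# Each visible mismatch is a shift of roughly 2.0, in line with the
# reported "Max absolute difference: 2.000002".
assert all(math.isclose(d, 2.0, abs_tol=1e-5) for d in offsets)

# The compared array has n_samples_per_batch * n_features entries.
total = 8192 * 400
mismatched = 1535867
mismatch_pct = round(100 * mismatched / total, 1)
print(total, mismatch_pct)
```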

Expected behavior
Given the same random_state, it should produce the same result.
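The contract being asked for is the usual one for seeded generators. The sketch below illustrates it with the Python standard library only; `make_toy_dataset` is a hypothetical stand-in written for this illustration and is not cuML's implementation or API.

```python
import random


def make_toy_dataset(n_samples, n_features, random_state):
    """Hypothetical seeded generator (illustration only, not cuML code)."""
    # A private RNG instance, so no global random state is involved.
    rng = random.Random(random_state)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [rng.randrange(2) for _ in range(n_samples)]
    return X, y


rs = 8192 * 400 * 4  # same seed expression as in the reproducer
X0, y0 = make_toy_dataset(16, 4, rs)
X1, y1 = make_toy_dataset(16, 4, rs)

# Same seed, same arguments: the outputs are expected to be identical.
assert X0 == X1
assert y0 == y1
```

The reproducer above applies the same check to `cuml.datasets.make_classification` and fails on `X`.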

Environment details:

  • Environment location: Bare-metal
  • Linux Distro/Architecture: Ubuntu 22.04 amd64
  • GPU Model/Driver: NVIDIA RTX A3000 Laptop GPU and driver 561.03
  • CUDA: 12.6
  • Method of cuDF & cuML install: conda
>>> import cupy
>>> cupy.__version__
'13.4.1'
>>> import cuml
>>> cuml.__version__
'25.02.01'
