[ENH] enable array API support in LogisticRegression#2941

Draft
avolkov-intel wants to merge 3 commits into uxlfoundation:main from avolkov-intel:dev/logreg-array-api

@avolkov-intel
Contributor

Description

  • Add array API support for GPU LogisticRegression
  • Refactor LogisticRegression estimator, move all checks to sklearnex
  • Implement separate sklearnex SPMD estimator for LogisticRegression
Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
  • I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@avolkov-intel added the enhancement (New feature or request) and Array API labels on Feb 11, 2026
@codecov

codecov bot commented Feb 11, 2026

Codecov Report

❌ Patch coverage is 19.40299% with 54 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sklearnex/linear_model/logistic_regression.py | 10.41% | 43 Missing ⚠️ |
| onedal/linear_model/logistic_regression.py | 42.10% | 11 Missing ⚠️ |

| Flag | Coverage Δ |
|---|---|
| azure | ? |
| github | 82.04% <19.40%> (+0.21%) ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| onedal/linear_model/logistic_regression.py | 29.76% <42.10%> (+4.22%) ⬆️ |
| sklearnex/linear_model/logistic_regression.py | 49.43% <10.41%> (-9.31%) ⬇️ |

... and 30 files with indirect coverage changes

@supports_queue
def fit(self, X, y, queue=None):

    # Is sparsity check here fine? - Same in BasicStatistics
Contributor

Yes, provided that the sklearnex validation has already run before this point.

dtype=[np.float64, np.float32],
)
y_proba = self._onedal_predict_proba(X, queue)
xp, is_array_api_complient = get_namespace(X)
Contributor

Suggested change
- xp, is_array_api_complient = get_namespace(X)
+ xp, is_array_api_compliant = get_namespace(X)

min_prob = 1e-15
max_prob = 1.0 - 1e-15

# TODO ARRAY API only branch check how this code can be adapted
Contributor

What do you mean by "adapted"? This already works with array API, and there is a scikit-learn method that can run on CPU with the fitted coefficients from oneDAL when they are on CPU.
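
For context, a minimal sketch of what bounds like `min_prob` and `max_prob` are typically used for, assuming (the diff excerpt does not confirm this) that the predicted probabilities are clipped before a logarithm is taken. It uses plain Python floats rather than an array namespace, and `clipped_log_proba` is a hypothetical helper name, not a function from this PR.

```python
import math

# Bounds copied from the snippet under review.
MIN_PROB = 1e-15
MAX_PROB = 1.0 - 1e-15


def clipped_log_proba(probabilities):
    """Clip each probability into [MIN_PROB, MAX_PROB], then take the log.

    Clipping keeps log(0) from ever being evaluated, so every result is finite.
    """
    return [math.log(min(max(p, MIN_PROB), MAX_PROB)) for p in probabilities]


print(clipped_log_proba([0.0, 0.5, 1.0]))
```

In an array API setting the same two steps would go through the namespace returned by `get_namespace` instead of `math`.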

accept_large_sparse=_sparsity_enabled,
dtype=[np.float64, np.float32],
)
xp, is_array_api_complient = get_namespace(X)
Contributor

Suggested change
- xp, is_array_api_complient = get_namespace(X)
+ xp, is_array_api_compliant = get_namespace(X)


def _onedal_predict(self, X, queue=None):
    if queue is None or queue.sycl_device.is_cpu:
        # TODO modify function to return array api compliant results???
Contributor

No, if the input is not supported, it should be offloaded to scikit-learn, which will likely error out on its own. Either way, daal4py makes such a check for whether to offload to scikit-learn or not:


So it's safe to not modify the output here.
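
The fallback behaviour described in this comment can be sketched as follows. This is a simplified illustration, not the daal4py or sklearnex dispatch code; `dispatch_predict`, `is_float_list`, `fast_predict`, and `stock_predict` are all hypothetical names.

```python
def dispatch_predict(X, onedal_supported, onedal_predict, sklearn_predict):
    """Run the accelerated path only when the input passes the support check."""
    if onedal_supported(X):
        return "onedal", onedal_predict(X)
    # Unsupported input is handed to the stock implementation unchanged,
    # which is then responsible for raising its own error if needed.
    return "sklearn", sklearn_predict(X)


# Toy stand-ins: pretend the fast path only accepts lists of floats.
def is_float_list(X):
    return all(isinstance(v, float) for v in X)


def fast_predict(X):
    return [v > 0.0 for v in X]


def stock_predict(X):
    return [float(v) > 0.0 for v in X]


print(dispatch_predict([0.5, -1.0], is_float_list, fast_predict, stock_predict))
print(dispatch_predict([1, -2], is_float_list, fast_predict, stock_predict))
```

Because the support check decides the route before any computation runs, the accelerated branch never sees input it cannot handle and does not need to convert its own output.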


def _onedal_score(self, X, y, sample_weight=None, queue=None):
    # TODO1 - OK
    # is accuracy_score array api compatible? Seems it is functions from skelarn it's ARRAY API compatible
Contributor

Yes, you can check for compatibility in their documentation:
https://scikit-learn.org/stable/modules/array_api.html#metrics


# TODO do we need to support behavior when model fitted with sklearn
# (e.g. torch tensor or else and then this method is run)
# Currently it can't
Contributor

It should be possible, because the oneDAL model object would only require coefficients and intercepts.
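
To illustrate why coefficients and intercepts are sufficient, here is a minimal sketch of binary logistic regression inference from those two quantities alone. It is plain Python, not the oneDAL model API; `predict_proba_binary` and the sample values are hypothetical.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba_binary(X, coef, intercept):
    """Return (P(class 0), P(class 1)) per row from coef and intercept only."""
    probabilities = []
    for row in X:
        z = sum(x * w for x, w in zip(row, coef)) + intercept
        p1 = _sigmoid(z)
        probabilities.append((1.0 - p1, p1))
    return probabilities


# Coefficients could come from any fitted estimator, whatever backend fitted it.
print(predict_proba_binary([[0.0, 0.0], [2.0, 1.0]], coef=[1.0, -2.0], intercept=0.0))
```

Nothing here depends on how the parameters were obtained, which is why a model fitted by stock scikit-learn could in principle be wrapped for inference.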


self._onedal_model = result.model

# For now it's fine to keep n_iteration as numpy variable
Contributor

I guess the answer should be yes for scikit-learn compatibility, since they do not fully support array API for logistic regression at the moment. But that might change in the future.

# Is sparsity check here fine? - Same in BasicStatistics
is_csr = _is_csr(X)

# Is it good place? - Same in LinReg
Contributor

Yes, but remember that this attribute also needs to be present in the sklearnex object, because sklearn has it.

dtype=[np.float64, np.float32],
)

xp, is_array_api_complient = get_namespace(X)
Contributor

Suggested change
- xp, is_array_api_complient = get_namespace(X)
+ xp, is_array_api_compliant = get_namespace(X)

@david-cortes-intel
Contributor

@avolkov-intel This will also require updates to the documentation for array API here:
https://uxlfoundation.github.io/scikit-learn-intelex/2025.10/array_api.html

LogisticRegression will be a special case: it won't work like the rest and won't meet some of the specifications mentioned in that doc. For example:
image
