Discrepancies between ONNX and sklearn probabilities with isotonic CalibratedClassifierCV

Hello, and thank you for your work on this great library!
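I'm seeing a noticeable difference between the ONNX and scikit-learn probabilities when using `CalibratedClassifierCV` with isotonic regression on top of a `RandomForestClassifier`: in the example below, about 1.12% of predictions differ at 5 decimal places, with a maximum absolute difference of about 0.0126.
It seems to happen only when the `max_depth` parameter is set high enough.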

I've provided a small snippet to reproduce the issue, with the following versions of libraries:

* `scikit-learn==1.6.0`
* `skl2onnx==1.18.0`
* `onnxruntime==1.20.1`


```python
import numpy as np
import onnxruntime as ort
from numpy.testing import assert_almost_equal
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(
    n_samples=400_000,
    n_features=15,
    n_informative=15,
    n_redundant=0,
    n_classes=2,
    n_clusters_per_class=2,
    random_state=30,
)
X = X.astype(np.float32)

rf = RandomForestClassifier(
    max_depth=10,
    n_jobs=-1,
    random_state=1234,
).fit(X, y)

model = CalibratedClassifierCV(rf, method="isotonic", cv="prefit").fit(
    X, y
)

model_onnx = convert_sklearn(
    model,
    initial_types=[("input", FloatTensorType([None, X.shape[1]]))],
    target_opset=15,
    options={"zipmap": False},
)

session = ort.InferenceSession(model_onnx.SerializeToString())

output = session.run(
    ["probabilities"],
    {"input": X},
)
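# Compare the positive-class probabilities from ONNX Runtime and scikit-learn.
onnx_probs = output[0][:, 1]
model_probs = model.predict_proba(X)[:, 1].astype(np.float32)

# Fails: the two outputs disagree beyond 5 decimal places.
assert_almost_equal(onnx_probs, model_probs, decimal=5)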
```

The result is:

```text
> Mismatched elements: 4485 / 400000 (1.12%)
Max absolute difference among violations: 0.01261032
Max relative difference among violations: 0.11618411
```
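To see which samples are affected, the mismatch can be summarized with a small helper instead of relying on the assertion message alone. This is only a sketch: `mismatch_report` is a hypothetical name, not part of `numpy` or `skl2onnx`, and it assumes the tolerance documented for `numpy.testing.assert_array_almost_equal`, namely `abs(desired - actual) < 1.5 * 10**(-decimal)`.

```python
import numpy as np


def mismatch_report(actual, desired, decimal=5, top_k=5):
    """Summarize where two probability vectors disagree (hypothetical helper)."""
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    abs_diff = np.abs(desired - actual)

    # Tolerance documented for numpy.testing.assert_array_almost_equal.
    tol = 1.5 * 10.0 ** (-decimal)
    bad = abs_diff >= tol

    # Indices of the largest disagreements, worst first.
    worst = np.argsort(abs_diff)[::-1][:top_k]

    return {
        "n_mismatched": int(bad.sum()),
        "fraction_mismatched": float(bad.mean()),
        "max_abs_diff": float(abs_diff.max()),
        "worst_indices": worst.tolist(),
    }
```

With the arrays from the snippet above, `mismatch_report(onnx_probs, model_probs)` would return the counts together with the indices of the worst rows, which can then be inspected in `X`.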

The supported-models page (https://onnx.ai/sklearn-onnx/supported.html) does not list `IsotonicRegression`, but I would expect `CalibratedClassifierCV` to be supported with both calibration methods (`sigmoid` and `isotonic`).
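One possible explanation, which I have not confirmed, is that the isotonic calibration map amplifies small differences in the random forest scores (for example from float32 versus float64 arithmetic). The sketch below illustrates the mechanism with plain `numpy`. The thresholds and the size of the score difference are made-up values for illustration, not taken from the fitted model, and `np.interp` stands in for the interpolation between fitted thresholds.

```python
import numpy as np

# Hypothetical calibration map: a monotone piecewise-linear function with one
# very steep segment between two neighbouring thresholds (made-up numbers).
x_thresholds = np.array([0.0, 0.4999, 0.5001, 1.0])
y_thresholds = np.array([0.0, 0.45, 0.55, 1.0])


def calibrate(p):
    """Interpolate linearly between the calibration thresholds."""
    return np.interp(p, x_thresholds, y_thresholds)


# Two uncalibrated scores that differ only slightly (assumed difference).
p_reference = 0.49995
p_shifted = p_reference + 5e-5

input_gap = abs(p_shifted - p_reference)
output_gap = float(abs(calibrate(p_shifted) - calibrate(p_reference)))

# The steep segment turns a tiny input gap into a much larger output gap.
print(f"input gap:  {input_gap:.6f}")
print(f"output gap: {output_gap:.6f}")
```

If this is what happens here, comparing the uncalibrated `rf.predict_proba(X)` against the ONNX output of the random forest alone would show much smaller differences than the calibrated comparison.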

Discrepancies between ONNX and sklearn probabilities with isotonic CalibratedClassifierCV #1151
