h2o.cut not separating by breaks correctly #8372

Open
@exalate-issue-sync

Description

In the example below, I want to create one bin for each decile of the model's predictions. Instead, the first several deciles are grouped together in a single bin, (0.0,0.001]:

{code:R}
library(h2o)
h2o.init()

# Import the bank marketing demo data and fit a minimal XGBoost model
df <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/demos/bank-additional-full.csv")
xgboost <- h2o.xgboost(y = "y", training_frame = df, max_depth = 1, ntrees = 1)

# Use the deciles of the predicted probability of "yes" as break points
preds <- h2o.predict(xgboost, df)
breaks <- h2o.quantile(preds$yes, probs = seq(0, 1, 0.1))
bins <- h2o.cut(preds$yes, breaks)

# Output as reported
h2o.table(bins)
#>             yes Count
#> 1   (0.0,0.001] 14099
#> 2 (0.001,0.005]  1840
#> 3  (0.005,0.12]  2061
#> 4  (0.12,0.965]  1999
{code}
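
For comparison, here is a minimal base R sketch of the result I would expect from decile binning. It runs on synthetic uniform scores rather than the bank dataset, and `scores` is a hypothetical stand-in for `preds$yes`. It is only meant to illustrate the expected shape of the output (ten bins from eleven break points), not to say how `h2o.cut` works internally.

```r
# Base R reference for the expected result; synthetic data, not the bank dataset.
set.seed(42)
scores <- runif(1000)  # hypothetical stand-in for preds$yes

# Eleven break points (0%, 10%, ..., 100%) delimit ten decile bins.
breaks <- quantile(scores, probs = seq(0, 1, 0.1))

# Base R's cut() rejects duplicated breaks, so check for ties first.
stopifnot(!anyDuplicated(breaks))

# include.lowest = TRUE keeps the minimum score in the first bin.
bins <- cut(scores, breaks = breaks, include.lowest = TRUE)

print(table(bins))
```

If the same tie check fails on the H2O break points (after pulling them into a local numeric vector), that would be worth noting here, since repeated quantile values are one way bins can end up merged.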
