hist function reports incorrect bins #8306

Open
@exalate-issue-sync


`hist()` in the Python API (tested on 3.24.0.1) appears to report incorrect breaks.

For example, `sepal_len` has a minimum value of 4.3, but the first break, which marks the first bin, is 5.02. The first row of the output also contains only `nan` values.

```python
>>> import h2o
>>> h2o.init()
>>> iris = h2o.import_file("http://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")
>>> iris.describe()
======================================================
       sepal_len  sepal_wid  petal_len  petal_wid  class
type   real       real       real       real       enum
mins   4.3        2.0        1.0        0.1
mean   5.8433     3.0539     3.7586666  1.1986666
maxs   7.9        4.4        6.9        2.5
======================================================
>>> iris[0].hist(breaks=5, plot=False)
breaks  counts  mids_true  mids  density
5.02    nan     nan        nan   nan
5.74    32      2.15       5.38  0.296296
6.46    41      2.55       6.1   0.37963
7.18    42      2.9        6.82  0.388889
7.9     35      3.25       7.54  0.324074
```
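For comparison, here is a minimal sketch of the breaks one might expect, assuming `hist(breaks=5)` is meant to build five equal-width bins between the column minimum and maximum. That assumption is mine and is not confirmed against the H2O source. `equal_width_breaks` is a hypothetical helper written for this illustration, not part of the h2o API.

```python
def equal_width_breaks(lo, hi, bins):
    """Return the bins + 1 edges of `bins` equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / bins
    # Round to two decimals so the edges can be compared with the printed output.
    return [round(lo + i * width, 2) for i in range(bins + 1)]


# mins and maxs of sepal_len, copied from the describe() output above
expected = equal_width_breaks(4.3, 7.9, 5)

# breaks column copied from the hist() output above
reported = [5.02, 5.74, 6.46, 7.18, 7.9]

# Edges that equal-width binning would produce but that hist() did not report
missing = [edge for edge in expected if edge not in reported]

print("expected:", expected)
print("missing: ", missing)
```

Under that assumption, five bins need six edges, and the only edge missing from the reported breaks is the lower bound 4.3. The reported values look like the upper edges of the bins only.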
