Sort-means appears to have a bug #14

@ghamerly

In some of my testing, sort-means gives slightly different results from all the other algorithms. I have a test case where every other algorithm converges in 45 iterations, but sort-means converges in 42 and produces slightly different centers.

Steps to reproduce:

  • Run naive k-means and sort-means on the attached input with k=10, the library's k-means++ initialization, and a maximum of 1000 iterations.
  • Expected result: the same number of iterations and the same centers.
  • Actual result: different centers and a different number of iterations.

The initial centers chosen by k-means++ on this dataset are:

    -0.379863       1.80735      0.658205      0.236417       1.52649 
    -0.456454      -0.44091      0.932621        1.4306      -0.77737 
      1.10611      -1.30305     -0.202388      -1.29549      0.150969 
     0.951702      0.885737    -0.0140388     -0.295967      -2.10619 
      2.52547      0.803819       1.57157      0.481766      0.168094 
     -0.35617      0.423922       0.72442     -0.557642      0.466705 
     0.619142      -1.08521      -1.85976      -1.19759     -0.763672 
    -0.929235      0.901898     -0.717973      -2.60481     0.0617718 
    -0.391341      0.258408     -0.368092       0.53643      -1.35026 
     0.242555       -1.5091   -0.00501431       0.30317         1.244 
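
To take the initialization out of the picture, it may help to run an independent reference from these exact starting centers and compare against it. Below is a minimal sketch of naive Lloyd's k-means in plain Python. The names `lloyd` and `max_center_diff` are hypothetical helpers and not part of the library, and the iteration count here includes the final pass that detects no change, which may differ from how the library counts.

```python
import math


def lloyd(points, init_centers, max_iters=1000):
    """Naive Lloyd's k-means from fixed initial centers (illustrative only).

    Returns (centers, assignments, iterations), where iterations counts every
    assignment pass, including the last one that finds no change.
    """
    centers = [list(c) for c in init_centers]
    assign = [-1] * len(points)
    dim = len(points[0])
    for it in range(1, max_iters + 1):
        # Assignment step: each point goes to its nearest center
        # (ties are broken toward the lowest center index).
        changed = False
        for i, p in enumerate(points):
            best = min(range(len(centers)),
                       key=lambda j: math.dist(p, centers[j]))
            if best != assign[i]:
                assign[i] = best
                changed = True
        if not changed:
            return centers, assign, it
        # Update step: each center moves to the mean of its points.
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        for p, a in zip(points, assign):
            counts[a] += 1
            for d in range(dim):
                sums[a][d] += p[d]
        for j, n in enumerate(counts):
            if n:  # an empty cluster keeps its previous center
                centers[j] = [s / n for s in sums[j]]
    return centers, assign, max_iters


def max_center_diff(centers_a, centers_b):
    """Largest absolute coordinate difference between two center sets,
    assuming both list the centers in the same order."""
    return max(abs(x - y)
               for ca, cb in zip(centers_a, centers_b)
               for x, y in zip(ca, cb))
```

Feeding both algorithms' final centers to `max_center_diff` against this reference would show which one drifts, and comparing assignments after each pass would show the first iteration where they diverge.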

bad_input.txt.gz

Labels: bug (Something isn't working)
