ValueError for large weights  #6

@do-me

Description

I am using a table of lat/lon points with weights, containing ~10,000 entries, and I would like to create 10 clusters. The maximum weight is 150,000. It contains a few (~20) zero weights.

WKmeans always throws the same error:

ValueError: One or more clusters disappeared because all points rushed away to other cluster(s). Try increasing the stickiness parameter (beta).

Changing beta or alpha doesn't change anything. However, I noticed that when I divide my weights by, say, 1000, it works again.
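In case it helps others, the workaround can be sketched like this: rescale the weights so their maximum is small before passing them to the clusterer. The helper name below is mine, not part of WKmeans; rescaling preserves the relative proportions between weights, so the weighted clustering itself should not be affected.

```python
import numpy as np

def rescale_weights(weights, target_max=100.0):
    """Scale weights so the largest one equals target_max.

    Relative proportions are preserved, so only the magnitude
    of the weights changes, not their ratios.
    """
    weights = np.asarray(weights, dtype=float)
    return weights * (target_max / weights.max())

scaled = rescale_weights([0, 20, 150_000])
print(scaled.max())  # 100.0
```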

This led me to the function causing the error:

    def _has_converged(self):
        """Check if the items in clusters have stabilised between two runs.

        This checks to see if the distance between the centroids is lower than
        a fixed constant.
        """
        diff = 1000
        if self.clusters:
            for clu in self.clusters:
                # For each cluster, check the length. If zero, we have a
                # problem: we have lost clusters.
                if len(clu) == 0:
                    raise ValueError('One or more clusters disappeared because'
                                     ' all points rushed away to other'
                                     ' cluster(s). Try increasing the'
                                     ' stickiness parameter (beta).')

It seems that diff = 1000 is causing the issue, as it's probably not suited to larger numbers or to a large difference between min and max weight, as in my case.

Could anyone recommend a way to generate this value dynamically instead of using a static constant? Couldn't it simply be something like the maximum weight divided by some factor?
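One possible direction (just a sketch of the suggestion above, not the library's actual code): derive the starting diff from the spread of the input data rather than hard-coding 1000, so the threshold scales with the magnitude of the values.

```python
import numpy as np

def initial_diff(points, factor=0.1):
    """Return a starting convergence 'diff' proportional to the
    diagonal extent of the point cloud, instead of a fixed 1000."""
    points = np.asarray(points, dtype=float)
    spread = points.max(axis=0) - points.min(axis=0)
    return float(np.linalg.norm(spread)) * factor

# For points spanning a 3 x 4 box, the diagonal is 5:
print(initial_diff([[0, 0], [3, 4]]))  # 0.5
```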
