VonMisesFisherMixture.log_likelihood() doesn't return a log likelihood value. What is it? #40

@mattroos

I'd like to compute the AIC goodness of fit for a fitted model. This requires the value of the likelihood function at the estimated von Mises-Fisher parameters. But what is being returned by VonMisesFisherMixture.log_likelihood()? It is an array of shape (n_clusters, n_samples), and every entry lies in [0, 1], so the values look like probabilities that a given sample belongs to a given cluster. They do not look like log-likelihood values, since the log of a probability is never positive. From this array, what is the correct way to compute the likelihood value needed for AIC? I think it is something like the code below, but I could be wrong, since I'm not yet certain what log_likelihood() returns.

import numpy as np
likelihood = vmf_soft.log_likelihood(x)  # shape (n_clusters, n_samples)
# Choose the cluster of highest probability, convert that probability to log-likelihood, and sum across all samples.
log_likelihood = np.sum(np.log(np.max(likelihood, axis=0)))
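
One way to test the guess that the returned array holds per-sample cluster membership probabilities is to check that every entry lies in [0, 1] and that each column sums to 1 across clusters. The helper below is a hypothetical diagnostic, not part of the library, and only assumes the (n_clusters, n_samples) layout described above.

```python
import numpy as np


def looks_like_posterior(values, atol=1e-6):
    """Return True if `values` (n_clusters, n_samples) behaves like posteriors.

    Hypothetical helper: checks that entries are in [0, 1] and that the
    entries for each sample (each column) sum to 1 across clusters.
    """
    values = np.asarray(values, dtype=float)
    in_unit_interval = np.all((values >= -atol) & (values <= 1.0 + atol))
    columns_sum_to_one = np.allclose(values.sum(axis=0), 1.0, atol=atol)
    return bool(in_unit_interval and columns_sum_to_one)
```

If this returns True for the array in question, the values are most likely normalized membership probabilities, and summing the log of their per-sample maximum would not give the data log-likelihood.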

This is based on equation 3.2 of the 2005 paper. The cluster/class weights may need to be involved too. I'm not sure if they're already incorporated into the values returned by log_likelihood().
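
For comparison, the standard log-likelihood of a mixture does include the weights: it is the sum over samples of log(sum over clusters of weight_k * f(x_i | mu_k, kappa_k)). The sketch below computes that quantity and an AIC value directly from fitted parameters, without relying on log_likelihood(). It makes several assumptions: the data are unit vectors in 3 dimensions (so the von Mises-Fisher normalizer has a closed form), the arguments `centers`, `weights` and `concentrations` are hypothetical names for the fitted mean directions, mixing weights and concentration parameters, and the parameter count used for AIC is one common convention rather than anything confirmed by the library.

```python
import numpy as np


def vmf_log_pdf_3d(x, mu, kappa):
    """Log density of a von Mises-Fisher distribution on the unit sphere in R^3.

    x: (n_samples, 3) unit vectors, mu: (3,) unit vector, kappa > 0.
    """
    # log(kappa / (4*pi*sinh(kappa))), rewritten to avoid overflow for large kappa
    log_norm = (np.log(kappa) - np.log(2.0 * np.pi) - kappa
                - np.log1p(-np.exp(-2.0 * kappa)))
    return log_norm + kappa * (x @ mu)


def mixture_log_likelihood(x, centers, weights, concentrations):
    """Sum over samples of log(sum_k weight_k * f(x_i | mu_k, kappa_k))."""
    # Row k holds log(weight_k) + log f(x_i | mu_k, kappa_k) for every sample i
    log_joint = np.stack([
        np.log(w) + vmf_log_pdf_3d(x, mu, kappa)
        for mu, w, kappa in zip(centers, weights, concentrations)
    ])
    # Log-sum-exp over clusters, shifted by the per-sample maximum for stability
    peak = log_joint.max(axis=0)
    per_sample = peak + np.log(np.exp(log_joint - peak).sum(axis=0))
    return per_sample.sum()


def aic(log_lik, n_clusters, n_features):
    """AIC = 2 * n_params - 2 * log-likelihood.

    Assumed parameter count: each mean direction has n_features - 1 degrees
    of freedom, each cluster has one concentration, and the weights sum to 1.
    """
    n_params = n_clusters * (n_features - 1) + n_clusters + (n_clusters - 1)
    return 2.0 * n_params - 2.0 * log_lik
```

Unlike taking the maximum over clusters, this sums the weighted densities of all clusters for each sample, which corresponds to the soft-assignment mixture model. For other dimensions the normalizer involves a modified Bessel function and would need a different implementation.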

@jasonlaska, maybe you can help?
