Description
K-Means finds some centroids $\mathbf{\mu}_1,\ldots,\mathbf{\mu}_K$ and assigns data points to a cluster as $c = {\rm argmin}_k \lbrace \Vert\boldsymbol{x} - \mathbf{\mu}_k\Vert^2\rbrace$, or in code:
c = torch.argmin(torch.cdist(x, centroids)**2)
Neither the assignment $c$ nor the distance $\Vert\boldsymbol{x} - \mathbf{\mu}_c\Vert^2$ are really explainable.
The assignment $c$ is not continuous, and the distance is in fact measuring a dissimilarity to the cluster, essentially the opposite of what we want to explain.
Fixes
From Clustering to Cluster Explanation via Neural Networks (2022) present a method based on neuralization. They show that the k-means cluster assignment can be exactly rewritten as a set of piecewise linear functions
$$f_c(\boldsymbol{x}) = \min_{k\neq c}\lbrace \boldsymbol{w}_{ck}^\top\boldsymbol{x} + b_{ck}\rbrace$$
and the original k-means cluster assignment can be recovered as $c = {\rm argmax}_k\lbrace f_k(\boldsymbol{x})\rbrace$.
The cluster discriminant $f_c$ is a measure of cluster membership (instead of a dissimilarity) and is also structurally amenable to LRP and friends. It can also be plugged on top of a neural network feature extractor to make deep cluster assignments explainable.
Additional Information
- There is no pairwise distance layer in PyTorch. Since zennit canonizers apply to
torch.nn.Module modules, such module would need to be added
- Each discriminant $f_c$ has it's own weight matrix $W\in\mathbb{R}^{(K-1)\times D}$. In principle, a neuralized k-means layer could be written as a layer
torch.nn.Sequential(
torch.nn.Linear(in_features=D, out_features=K*(K-1)),
torch.nn.Unflatten(1, torch.Size([K, K-1]))
)
or alternatively via torch.einsum, which I think is easier. Either way, a special layer is needed for that.
- The paper suggests a specific LRP rule for the $\min_{k\neq c}\lbrace s_k\rbrace$ to make the explanations continuous. In practice, we can replace it by $\beta^{-1}{\rm LogMeanExp}_k\lbrace \beta\cdot s_k\rbrace$ which approximates the min for $\beta < 0$, but has continuous gradients. Gradient propagation through this layer recovers exactly the LRP rule from the paper, but this also provides explanation continuity for other gradient-based explanation methods. I would suggest to implement such a
LogMeanExpPool layer. Alternative could be to implement a MinPool layer with a custom LRP rule.
- A canonizer could be used to replace the distance/k-means layer by it's neuralized counterpart. Alternatively, a composite might also work, but it seems a bit more complicated (where are the weights computed? how to set LRP rules for the linear/einsum layer?)
Description
K-Means finds some centroids$\mathbf{\mu}_1,\ldots,\mathbf{\mu}_K$ and assigns data points to a cluster as $c = {\rm argmin}_k \lbrace \Vert\boldsymbol{x} - \mathbf{\mu}_k\Vert^2\rbrace$ , or in code:
Neither the assignment$c$ nor the distance $\Vert\boldsymbol{x} - \mathbf{\mu}_c\Vert^2$ are really explainable.$c$ is not continuous, and the distance is in fact measuring a dissimilarity to the cluster, essentially the opposite of what we want to explain.
The assignment
Fixes
From Clustering to Cluster Explanation via Neural Networks (2022) present a method based on neuralization. They show that the k-means cluster assignment can be exactly rewritten as a set of piecewise linear functions
and the original k-means cluster assignment can be recovered as$c = {\rm argmax}_k\lbrace f_k(\boldsymbol{x})\rbrace$ .
The cluster discriminant$f_c$ is a measure of cluster membership (instead of a dissimilarity) and is also structurally amenable to LRP and friends. It can also be plugged on top of a neural network feature extractor to make deep cluster assignments explainable.
Additional Information
torch.nn.Modulemodules, such module would need to be addedor alternatively via
torch.einsum, which I think is easier. Either way, a special layer is needed for that.LogMeanExpPoollayer. Alternative could be to implement aMinPoollayer with a custom LRP rule.