Currently, a fixed number of embedding clusters is identified per partition. We want to:
- Make the number of clusters dynamic, using GATE's PCA method to determine it
- Generate a one-sentence summary of each cluster, for interpretability. We can probably use an LLM for this.
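
A rough sketch of how the two goals above could fit together. It is not GATE's actual PCA method, whose details are not given here. As a stand-in, it assumes a common heuristic: take the number of principal components needed to explain a chosen share of the variance as the cluster count. The names `estimate_num_clusters`, `build_summary_prompt`, `summarize_cluster`, and the `variance_threshold` parameter are hypothetical, and `llm` is any callable that maps a prompt string to a response string, not a specific vendor API.

```python
import numpy as np


def estimate_num_clusters(embeddings, variance_threshold=0.9,
                          min_clusters=1, max_clusters=None):
    """Assumed heuristic (not necessarily GATE's): number of principal
    components needed to explain `variance_threshold` of the variance."""
    X = np.asarray(embeddings, dtype=float)
    if X.ndim != 2 or X.shape[0] < 2:
        return min_clusters
    X = X - X.mean(axis=0)  # center the data before PCA
    # Squared singular values of the centered data are proportional
    # to the variance explained by each principal component.
    variances = np.linalg.svd(X, compute_uv=False) ** 2
    total = variances.sum()
    if total == 0:  # all points identical: nothing to split
        return min_clusters
    cumulative = np.cumsum(variances) / total
    # First index where the cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    k = min(k, len(cumulative))  # guard against floating-point shortfall
    if max_clusters is not None:
        k = min(k, max_clusters)
    return max(min_clusters, k)


def build_summary_prompt(texts, max_examples=5):
    """Build a prompt asking for a one-sentence summary of a cluster."""
    examples = "\n".join(f"- {t}" for t in texts[:max_examples])
    return (
        "Summarize what the following items have in common "
        "in exactly one sentence.\n" + examples
    )


def summarize_cluster(texts, llm, max_examples=5):
    """`llm` is a hypothetical callable: prompt string in, text out."""
    return llm(build_summary_prompt(texts, max_examples)).strip()
```

The cluster count from `estimate_num_clusters` would then be passed to whatever clustering step runs per partition, and `summarize_cluster` would be called once per resulting cluster with a sample of its member texts. The threshold and the number of examples sent to the LLM are tuning choices, not values taken from the existing pipeline.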