Releases: MaartenGr/BERTopic

Topic Probability Distribution

29 Oct 13:23
bdfebd5
  • transform() and fit_transform() now also return the topic probability distributions
  • Added visualize_distribution() which visualizes the topic probability distribution for a single document
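
A minimal sketch of how a returned topic probability distribution might be consumed, assuming the distribution for one document arrives as a sequence of floats with one entry per topic. The helper name top_topics and the sample values are hypothetical and not part of the library.

```python
def top_topics(distribution, k=3):
    """Return the k most probable (topic_index, probability) pairs."""
    # Pair each probability with its topic index, then sort by probability.
    ranked = sorted(enumerate(distribution), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical distribution for one document over four topics.
doc_distribution = [0.05, 0.70, 0.20, 0.05]
print(top_topics(doc_distribution, k=2))  # [(1, 0.7), (2, 0.2)]
```

Ranking the distribution this way shows which topics are close runners-up for a document, which a single topic assignment would hide.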

Small patch release

17 Oct 06:43
a61a768
  • Fixed n_gram_range not being used
  • Added an option to use stopwords

Small patch release

11 Oct 13:37
dd9582e

Improved the class-based TF-IDF (c-TF-IDF) calculation by restricting it to sparse matrices. This prevents out-of-memory errors on large datasets.
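
To illustrate why a sparse representation saves memory, here is a rough sketch that stores only nonzero term counts per class. It is not the library's implementation: it uses plain dictionaries rather than sparse matrices, the function name sparse_c_tf_idf and its input format are hypothetical, and the weighting (term frequency within the class times log(1 + n_docs / total term frequency)) is an assumed formulation.

```python
import math
from collections import Counter, defaultdict


def sparse_c_tf_idf(tokens_per_class, n_docs):
    """Score terms per class while storing only nonzero counts.

    tokens_per_class: dict mapping a class label to the list of tokens
        from all documents in that class (hypothetical input format).
    n_docs: total number of documents across all classes.
    """
    # One Counter per class: terms that never occur take no memory.
    counts = {label: Counter(tokens) for label, tokens in tokens_per_class.items()}

    # Total frequency of each term across all classes.
    total = defaultdict(int)
    for class_counts in counts.values():
        for term, freq in class_counts.items():
            total[term] += freq

    scores = {}
    for label, class_counts in counts.items():
        n_words = sum(class_counts.values())
        # Assumed weighting: class-normalized frequency times a log-scaled rarity term.
        scores[label] = {
            term: (freq / n_words) * math.log(1 + n_docs / total[term])
            for term, freq in class_counts.items()
        }
    return scores


example = {"a": ["cat", "cat", "dog"], "b": ["dog", "fish"]}
scores = sparse_c_tf_idf(example, n_docs=4)
print(sorted(scores["a"]))  # ['cat', 'dog']
```

A dense class-by-vocabulary array would allocate a cell for every term in every class, whereas here "fish" never appears in the entry for class "a".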

Fixed missing mapped_topics

01 Oct 09:09

When transforming new documents, self.mapped_topics could be missing. It is now set in the init.

Fixed requirements

24 Sep 12:58
  • Fixed requirements (issue with PyTorch)
  • Updated docs
  • Updated readme

First Release

24 Sep 12:17
  • Added parameters for UMAP and HDBSCAN
  • Option to choose sentence-transformer model
  • Method for transforming unseen documents
  • Save and load trained models (UMAP and HDBSCAN)
  • Extract topics and their sizes
  • Optimized c-TF-IDF
  • Improved documentation
  • Improved topic reduction