v0.5.0 (April 2018)
Enhancements:
Plotting and transforming text data
- `hyp.plot` now supports plotting text data. Simply pass a string, a list of strings, or a list of lists of strings, and the text will be transformed using a semantic model and plotted. By default, the text is transformed with a topic model (LDA) that was fit to a selection of Wikipedia pages.
- A new `vectorizer` argument in `hyp.plot` to specify a text vectorizer. Currently supports `CountVectorizer`, `TfidfVectorizer`, or class instances (fit or unfit) of these models.
- A new `semantic` argument in `hyp.plot` that specifies the semantic model used to transform text. Currently supports `LatentDirichletAllocation`, `NMF`, or class instances (fit or unfit) of these models.
- A new `corpus` argument in `hyp.plot` that allows the user to specify the text used to fit a semantic model. Can be 'wiki', 'nips', 'sotus' or a custom list of text.
- Enhanced `hyp.format_data` function that takes data in various forms (numpy array, dataframe, str, list of str, or mixed list) and returns it in a standard format (a list of numpy arrays). This function can also be used to transform text data using a semantic model.
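To make the vectorizer step concrete, here is a minimal standard-library sketch of a bag-of-words count matrix, the kind of representation a count-based vectorizer produces before a semantic model such as LDA or NMF is applied. It is a simplified stand-in for illustration only: the function name `bag_of_words` and the tokenization rule are assumptions, not hypertools or scikit-learn code.

```python
import re
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count_matrix) for a list of text documents.

    Hypothetical helper for illustration; real vectorizers apply
    their own tokenization and filtering rules.
    """
    # Lowercase each document and split it into alphabetic tokens.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    # A sorted vocabulary gives every word a stable column index.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        # One row per document, one column per vocabulary word.
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


docs = ["the cat sat", "the dog sat on the cat"]
vocab, counts = bag_of_words(docs)
print(vocab)   # ['cat', 'dog', 'on', 'sat', 'the']
print(counts)  # [[1, 0, 0, 1, 1], [1, 1, 1, 1, 2]]
```

A semantic model then maps each row of such a matrix to a lower-dimensional vector (for example, topic weights), and those vectors are what get plotted.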
New algorithms
- A new clustering algorithm, HDBSCAN (thanks @lmcinnes!), e.g. `hyp.plot(data, cluster='HDBSCAN')`
- A new dimensionality reduction algorithm, UMAP (thanks @lmcinnes!), e.g. `hyp.plot(data, reduce='UMAP')`
New parameters
- A new `size` param to resize the figure, e.g. `hyp.plot(data, size=[10,8])`
- A new `ax` param to add the figure to an existing axis, e.g. `hyp.plot(data, ax=ax)`
New text examples
- A new dataset of NIPS papers, e.g. `hyp.load('nips')` (from Kaggle)
- A new dataset of selected Wikipedia pages, e.g. `hyp.load('wiki')`
- A new dataset of State of the Union text from 1989-2017. Can be loaded as `hyp.load('sotus')` (from Kaggle)
API changes
- In `hyp.plot`, changed the `group` arg to `hue` (`group` will still be supported but deprecated in a coming release).
- Removed the deprecated `describe_pca` function. Please use the more general function, `describe`.
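For readers maintaining code that still passes the old keyword, the following sketch shows one common way a renamed argument can keep accepting its old name while warning about it. This is a general pattern, not the hypertools implementation: the helper `resolve_hue` is hypothetical, and whether hypertools emits a warning for the old name is an assumption.

```python
import warnings


def resolve_hue(hue=None, group=None):
    """Return grouping labels, accepting the old `group` name with a warning.

    Hypothetical helper illustrating a keyword rename; not library code.
    """
    if group is not None:
        warnings.warn(
            "`group` is deprecated; use `hue` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Prefer the new name when both are given.
        if hue is None:
            hue = group
    return hue


# Old call style: hyp.plot(data, group=labels)
# New call style: hyp.plot(data, hue=labels)
labels = resolve_hue(hue=["a", "b", "a"])
```

In user code, the migration itself is just a matter of renaming the keyword at each call site.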
Bugs fixed
- When using `chemtrails` in `hyp.plot`, the entire timeseries would appear for the first few seconds of an animation and then disappear.
- The legend colors did not align with the data when using the `fmt` or `color` args.
- When grouping with the `group`/`hue` arg, labels were not reshuffled.
- Fixed a bug in the `describe` function where correlations between data and reduced data would asymptote below 1.
NOTE: If you have been using the development version of 0.5.0, please clear your
data cache (/Users/yourusername/hypertools_data).
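Clearing the cache can be done by hand, or with a short script along these lines. This is a sketch: the helper name `clear_cache` is hypothetical, and the assumption that the cache is a `hypertools_data` folder in your home directory is inferred from the example path in the note, so check the location on your system before deleting anything.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir):
    """Delete `cache_dir` and its contents; return True if it existed."""
    cache_dir = Path(cache_dir)
    if cache_dir.is_dir():
        # Remove the directory tree, including any cached datasets/models.
        shutil.rmtree(cache_dir)
        return True
    return False
```

For example, `clear_cache(Path.home() / "hypertools_data")` would remove the folder if it is present and return `False` otherwise.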