rkeisler/tsne_guardian

t-SNE Guardian

This little project shows a t-SNE visualization of articles from The Guardian published in 2014.

What I've done here

  • Used the Guardian API to grab the title and thumbnail of all articles published in 2014.
  • Used the spaCy NLP library to extract nouns from titles and trailing text. Each article then becomes a "bag-of-nouns".
  • Calculated the cosine distance between each pair of articles in the bag-of-nouns space.
  • Used the scikit-learn t-SNE implementation to embed the articles in a 2D space, based on those cosine distances.
  • Made a big JPEG image showing the thumbnails for the articles in this 2D space.
  • Hacked together some Leaflet/JavaScript for browser visualization.
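
The bag-of-nouns and cosine-distance steps above can be sketched roughly as follows. This is only an illustration, not the code from this repo: it uses the Python standard library, the three example bags are made up, and the helper name `cosine_distance` is hypothetical. It assumes the nouns have already been extracted from each article.

```python
from collections import Counter
from math import sqrt


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity between two noun-count vectors."""
    # Counter returns 0 for nouns that are missing from b.
    dot = sum(count * b[noun] for noun, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an article with no nouns is treated as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


# Made-up bags of nouns, as if already extracted from three headlines.
bags = [
    Counter(["election", "vote", "party"]),
    Counter(["election", "party", "leader"]),
    Counter(["football", "goal", "match"]),
]

# Pairwise distance matrix: distances[i][j] compares article i with article j.
distances = [[cosine_distance(a, b) for b in bags] for a in bags]
```

A matrix like `distances` could then be handed to scikit-learn's `TSNE` with `metric="precomputed"` (which also requires `init="random"`) to get 2D coordinates, though the exact settings used for this project are not stated here.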

Results
