By Parvez Khan
Welcome to Newspaper-Scrape, a project that digs into the New York Times Technology section and pulls out the good stuff. Using a mix of web scraping, natural language processing, sentiment analysis, and a little Python magic, this tool automatically:
- Extracts article text and metadata
- Summarizes content
- Analyzes polarity (positive/negative tone)
- Measures subjectivity (objective vs. opinionated writing)
Perfect if you want a quick pulse on what's happening in tech without wading through every single article.
- Smart NLP: Summaries generated with textblob
- Sentiment Analysis: Detects how positive, neutral, or negative an article feels
- Metadata Extraction: Grab titles, authors, and publication dates
- Scraping Power: Runs through the New York Times Tech section using newspaper3k and BeautifulSoup
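To make the sentiment feature concrete, here is a minimal sketch of how polarity and subjectivity scores could be read from TextBlob and turned into a positive/neutral/negative label. The names `ArticleSentiment`, `label_polarity`, and `analyze_text`, and the 0.1 threshold, are assumptions made for illustration and are not taken from this repo's script.

```python
from typing import NamedTuple


class ArticleSentiment(NamedTuple):
    polarity: float      # -1.0 (negative) to 1.0 (positive)
    subjectivity: float  # 0.0 (objective) to 1.0 (opinionated)
    label: str


def label_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score to a coarse label.

    The threshold is an illustrative assumption, not a TextBlob default.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def analyze_text(text: str) -> ArticleSentiment:
    """Score text with TextBlob's default sentiment analyzer."""
    # Imported here so label_polarity works even without textblob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    return ArticleSentiment(
        polarity=sentiment.polarity,
        subjectivity=sentiment.subjectivity,
        label=label_polarity(sentiment.polarity),
    )
```

Calling `analyze_text(article_text)` returns both raw scores plus the label, so the threshold can be tuned without touching the scoring step.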
To run this project locally, make sure you've got Python 3+ and the following packages installed:
pip install textblob newspaper3k requests bs4
No need to install time or random: those come built into Python.
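If you want to confirm the install worked before running anything, a small check like the one below may help. It is a sketch, and `REQUIRED_MODULES` and `missing_modules` are hypothetical names. Note that import names can differ from pip names: newspaper3k is imported as `newspaper`.

```python
from importlib.util import find_spec

# Import names for the packages listed above
# (newspaper3k installs the "newspaper" module).
REQUIRED_MODULES = ["textblob", "newspaper", "requests", "bs4"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_modules(REQUIRED_MODULES)` means everything is importable; otherwise the list names what still needs installing.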
- Clone this repo using GitHub Desktop or your favorite IDE (PyCharm works great).
- Run the script and let it pull articles directly from the NYT Tech section.
- View summaries, sentiment scores, and metadata in seconds.
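For a rough idea of what happens when the script runs, the steps above could be sketched as follows. This is not the repo's actual code: the section URL, the article URL pattern, the random delay between requests, and every function name here are assumptions made for illustration. Summarization is left out to keep the sketch short.

```python
import random
import re
import time
from urllib.parse import urljoin

# Assumed section URL; the script in this repo may use a different one.
SECTION_URL = "https://www.nytimes.com/section/technology"

# Assumed URL shape for articles: /YYYY/MM/DD/technology/slug.html
ARTICLE_PATH = re.compile(r"/\d{4}/\d{2}/\d{2}/technology/[^/]+\.html$")


def filter_article_links(hrefs, base_url=SECTION_URL):
    """Resolve relative links and keep unique ones that look like articles."""
    seen, links = set(), []
    for href in hrefs:
        url = urljoin(base_url, href).split("?")[0]  # drop tracking params
        if ARTICLE_PATH.search(url) and url not in seen:
            seen.add(url)
            links.append(url)
    return links


def collect_links(section_url=SECTION_URL):
    """Fetch the section page and return candidate article URLs."""
    # Third-party imports are local so the helper above has no dependencies.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(section_url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    hrefs = [a["href"] for a in soup.find_all("a", href=True)]
    return filter_article_links(hrefs, section_url)


def scrape(url):
    """Download one article and return its metadata and sentiment scores."""
    from newspaper import Article
    from textblob import TextBlob

    article = Article(url)
    article.download()
    article.parse()
    sentiment = TextBlob(article.text).sentiment
    return {
        "title": article.title,
        "authors": article.authors,
        "published": article.publish_date,
        "polarity": sentiment.polarity,
        "subjectivity": sentiment.subjectivity,
    }


def run(limit=5):
    """Scrape up to `limit` articles, pausing briefly between requests."""
    results = []
    for url in collect_links()[:limit]:
        results.append(scrape(url))
        time.sleep(random.uniform(1.0, 3.0))  # assumed polite delay
    return results
```

Keeping `filter_article_links` free of network calls means the link logic can be checked against a handful of sample hrefs without hitting the site.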
Spotted a bug? Have an idea for a new feature? Drop an issue in the Issues tab, and I'll try to respond quickly.
Thanks for checking out Newspaper-Scrape! I hope this project saves you time and sparks ideas for your own experiments with web scraping + NLP.