🐞 Scrapy-based crawlers for Taiwanese news, covering 10 media companies:
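- 蘋果日報 (Apple Daily)
- 中國時報 (China Times)
- 中央社 (Central News Agency, CNA)
- 華視 (CTS)
- 東森新聞雲 (ETtoday)
- 自由時報 (Liberty Times)
- 公視 (PTS)
- 三立 (SETN)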
- TVBS
- UDN

Quick start:

$ git clone https://github.com/TaiwanStat/Taiwan-news-crawlers.git
$ cd Taiwan-news-crawlers
$ pip install -r requirements.txt
$ scrapy crawl apple -o apple_news.json

Prerequisites:

- Python 3
- Scrapy 1.3.0

Usage:

$ scrapy crawl <spider> -o <output_name>

Available spiders:

- apple
- appleRealtime
- china
- cna
- cts
- ettoday
- liberty
- libertyRealtime
- pts
- setn
- tvbs
- udn
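
To crawl every source in one pass, the usage pattern above can be wrapped in a small script. The following is a minimal sketch, not part of this repository: the helper names `build_command` and `crawl_all` are hypothetical, and the `<spider>_news.json` output naming is an assumption borrowed from the quick-start example.

```python
import subprocess
from pathlib import Path

# Spider names copied from the list above.
SPIDERS = [
    "apple", "appleRealtime", "china", "cna", "cts", "ettoday",
    "liberty", "libertyRealtime", "pts", "setn", "tvbs", "udn",
]


def build_command(spider, output_dir="."):
    """Build the `scrapy crawl` command for one spider without running it."""
    if spider not in SPIDERS:
        raise ValueError(f"unknown spider: {spider}")
    # Assumed naming scheme: <spider>_news.json, as in the quick-start example.
    output = Path(output_dir) / f"{spider}_news.json"
    return ["scrapy", "crawl", spider, "-o", str(output)]


def crawl_all(output_dir="."):
    """Run every spider in sequence; must be called from the project root."""
    for spider in SPIDERS:
        # check=True stops the loop if a crawl exits with an error.
        subprocess.run(build_command(spider, output_dir), check=True)
```

Calling `crawl_all()` from the repository root would run the spiders one after another, writing one JSON file per spider. Running them sequentially keeps the load on each site low at the cost of a longer total runtime.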

Output:

| Key | Value |
|---|---|
| website | the publisher |
| url | the URL of the original article |
| title | the news title |
| content | the news content |
| category | the category of the news |
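
The fields above can be checked after a crawl with a few lines of Python. This is a rough sketch assuming the output was written with a `.json` file name, so that the file holds a single JSON array of records. The function names are hypothetical and not part of this repository.

```python
import json
from collections import Counter

# Field names taken from the output table above.
EXPECTED_KEYS = {"website", "url", "title", "content", "category"}


def load_articles(path):
    """Load a feed written by `scrapy crawl <spider> -o <file>.json`."""
    with open(path, encoding="utf-8") as f:
        # Assumption: the file is one JSON array of article objects.
        return json.load(f)


def missing_keys(article):
    """Return the expected keys that are absent from one article record."""
    return EXPECTED_KEYS - set(article)


def count_by_category(articles):
    """Count articles per category, skipping records without a category."""
    return Counter(a["category"] for a in articles if a.get("category"))
```

For example, `count_by_category(load_articles("apple_news.json"))` would give a per-category tally for the quick-start crawl, and `missing_keys` can flag records where a spider failed to extract a field.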

License: The MIT License