A web scraper that extracts posts from Reddit and stores the title and description of each post in a CSV file.
A handy tool if you need Reddit data in CSV format for your next project!
This scraper was built using JavaScript, a Node.js starter kit, and Puppeteer.
The starter kit can be found here: https://github.com/MLH/mlh-hackathon-nodejs-starter
This project requires the following tools:
- Node.js - The JavaScript environment for server-side code.
- NPM - A Node.js package manager used to install dependencies.
- PostgreSQL - A relational database system.
To get started, install NPM and Postgres on your local computer if you don't have them already. A simple way for Mac OS X users to install Postgres is to use Postgres.app. Windows users can follow a Windows installation guide for PostgreSQL.
Next, run the following command to install the dependencies:
$ npm install
Then run the following command to start the scraper:
$ node app/controllers/index.js
The generated post.csv file can be found in the app folder.
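For anyone curious how the title and description of each post could end up in a CSV file, here is a minimal sketch using only Node.js built-ins. It is not taken from this project's code: the function names `escapeCsvField` and `toCsv`, the header row, and the sample posts are all hypothetical, and the file is written to a temporary directory rather than the app folder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Quote a field if it contains a comma, double quote, or line break,
// doubling any embedded double quotes (the usual CSV convention).
function escapeCsvField(value) {
  const text = String(value ?? '');
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text with a header row from an array of
// { title, description } objects (hypothetical shape).
function toCsv(posts) {
  const rows = posts.map((post) =>
    [post.title, post.description].map(escapeCsvField).join(',')
  );
  return ['title,description', ...rows].join('\n') + '\n';
}

// Hypothetical sample data standing in for scraped posts.
const samplePosts = [
  { title: 'Hello, world', description: 'She said "hi"' },
  { title: 'Plain title', description: '' },
];

const csvText = toCsv(samplePosts);

// Assumption: written to the OS temp directory for this sketch only.
const outputPath = path.join(os.tmpdir(), 'post.csv');
fs.writeFileSync(outputPath, csvText);
console.log(`Wrote ${samplePosts.length} posts to ${outputPath}`);
```

Quoting matters here because Reddit titles and descriptions often contain commas, quotes, and line breaks, any of which would otherwise shift columns when the file is opened in a spreadsheet.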