Scrape the product information from every product page on Books to Scrape and save it to CSV using the Requests, Beautiful Soup, csv, and re libraries.
The script follows this logic:
- Get a list of all category links
- For each category link:
  - Parse the links to the product pages
  - If there is a "Next" page, go to that page and parse its product page links
- Create a CSV file
- For each product page:
  - Parse the product information
  - Insert the product information into the CSV
  - Save the image of the book
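The category and pagination steps above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function names, the CSS selectors (`article.product_pod h3 a` for product links, `li.next a` for the "Next" link), and the injectable `fetch` argument are all assumptions made for the example.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download a page with Requests and return its HTML (hypothetical helper)."""
    import requests  # imported lazily so the parsing logic can be tried offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def collect_product_links(category_url: str, fetch=fetch_html) -> list:
    """Follow "Next" links from a category page and gather every product link."""
    links = []
    url = category_url
    while url:
        soup = BeautifulSoup(fetch(url), "html.parser")
        # Assumed markup: each book is an <article class="product_pod">
        # whose <h3> contains the link to the product page.
        for anchor in soup.select("article.product_pod h3 a"):
            links.append(urljoin(url, anchor["href"]))
        # Assumed markup: <li class="next"><a href="page-2.html">next</a></li>
        next_link = soup.select_one("li.next a")
        # Stop when the page has no "Next" link.
        url = urljoin(url, next_link["href"]) if next_link else None
    return links
```

Passing `fetch` as an argument makes it possible to try the loop against saved HTML before pointing it at the live site.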
You will need the Requests and Beautiful Soup libraries.
Install them before cloning the repo:
- Requests
pip install requests
- Beautiful Soup
pip install bs4
- Clone the repo
git clone https://github.com/Jliezed/oc_project_2_BookToScrape.git
- Go to your project directory
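cd oc_project_2_BookToScrape
- Check that the venv module is available (it is part of the standard library since Python 3.3, so it does not need to be installed with pip; on Debian/Ubuntu it may require the python3-venv system package)
python -m venv --help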
- Create a virtual environment
python -m venv env
- Activate the virtual environment
source env/bin/activate
- Install the packages using requirements.txt
pip install -r requirements.txt
- Run the script using the terminal
python main.py
The script produces one CSV file per category, with the following fields for each product page:
- product_page_url
- universal_product_code (upc)
- title
- price_including_tax
- price_excluding_tax
- number_available
- product_description
- category
- review_rating
- image_url
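As a rough illustration of how the fields listed above could end up in one CSV file per category, here is a small sketch built on `csv.DictWriter`. The function names, the `csv` output folder, and the assumption that the availability text looks like "In stock (22 available)" are all made up for the example and may differ from what main.py does.

```python
import csv
import re
from pathlib import Path

# Column names taken from the field list above.
FIELDNAMES = [
    "product_page_url",
    "universal_product_code",
    "title",
    "price_including_tax",
    "price_excluding_tax",
    "number_available",
    "product_description",
    "category",
    "review_rating",
    "image_url",
]


def parse_number_available(text: str) -> int:
    """Extract the stock count from text such as "In stock (22 available)"."""
    match = re.search(r"\d+", text)
    # Fall back to 0 when the text contains no number (assumed behaviour).
    return int(match.group()) if match else 0


def write_category_csv(category: str, products: list, folder: str = "csv") -> Path:
    """Write one CSV file named after the category (assumed file layout)."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{category}.csv"
    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(products)
    return target
```

Each item in `products` is expected to be a dict keyed by the names in `FIELDNAMES`; `DictWriter` raises a `ValueError` if a dict contains a key that is not in that list.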
The script also saves the product image for each product page.
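Saving an image mostly comes down to choosing a safe file name and writing the downloaded bytes to disk. The sketch below shows one possible approach; `image_filename`, `save_image`, and the `images` folder are hypothetical names chosen for the example, not necessarily what the repo uses.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def image_filename(title: str, image_url: str) -> str:
    """Build a filesystem-safe file name from a book title (hypothetical helper)."""
    # Replace every run of characters that is not a letter or digit with "_".
    slug = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_").lower()
    # Reuse the extension of the remote file, defaulting to .jpg.
    suffix = Path(urlparse(image_url).path).suffix or ".jpg"
    return f"{slug or 'untitled'}{suffix}"


def save_image(image_url: str, title: str, folder: str = "images") -> Path:
    """Download one image with Requests and write it into `folder`."""
    import requests  # imported lazily so image_filename can be used offline

    response = requests.get(image_url, timeout=10)
    response.raise_for_status()
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / image_filename(title, image_url)
    # response.content holds the raw bytes of the image.
    target.write_bytes(response.content)
    return target
```

Building the name from the title keeps the files readable, but two books with the same title would overwrite each other; adding the UPC to the name would be one way to avoid that.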

