OC - PROJECT N°2 - BOOK TO SCRAPE

Scrape the product information from every product page of Books to Scrape and save it to CSV files, using the Requests, Beautiful Soup, csv and re libraries.

By Emile Perron

About The Project

The steps below summarize the logic of the script:

Get a list of all category links
For each category link
- Parse the links to the product pages
- If there is a "Next" page, go to that page and parse its product page links
Create a CSV file
For each product page
- Parse the product information
- Insert the product information into the CSV file
- Save the image of the book
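
The link-collection step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: it uses only the standard library (`re`, `urllib.parse`) so that it runs offline, whereas the project itself relies on Requests and Beautiful Soup. The function name `collect_product_links`, the `fetch` parameter, and both regular expressions are hypothetical; the patterns assume that each product link sits inside an `<h3>` and that the pagination link sits inside `<li class="next">`.

```python
import re
from urllib.parse import urljoin

# Assumed markup: <h3><a href="..."> for products,
# <li class="next"><a href="..."> for the pagination link.
PRODUCT_LINK = re.compile(r'<h3>\s*<a href="([^"]+)"')
NEXT_LINK = re.compile(r'<li class="next">\s*<a href="([^"]+)"')


def collect_product_links(category_url, fetch):
    """Return absolute product URLs for a category, following "next" pages.

    `fetch` is any callable that takes a URL and returns the page HTML,
    for example: lambda url: requests.get(url, timeout=10).text
    """
    links = []
    page_url = category_url
    while page_url:
        html = fetch(page_url)
        # Resolve relative hrefs against the page they were found on.
        links.extend(urljoin(page_url, href) for href in PRODUCT_LINK.findall(html))
        next_match = NEXT_LINK.search(html)
        # Stop when the page has no "next" link.
        page_url = urljoin(page_url, next_match.group(1)) if next_match else None
    return links
```

Passing the download function in as `fetch` keeps the pagination logic separate from the network call, so the loop can be exercised with canned HTML.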


Built With


Getting Started

You will need to install the Requests and Beautiful Soup libraries.

Prerequisites

Install the Python libraries before cloning the repo:

Requests
```
pip install requests
```
Beautiful Soup
```
pip install beautifulsoup4
```

Installation & Running the script

Clone the repo

git clone https://github.com/Jliezed/oc_project_2_BookToScrape.git

Create and activate a virtual environment

Go to your project directory
```
cd oc_project_2_BookToScrape
```
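The venv module ships with Python 3, so it normally needs no separate installation. On Debian or Ubuntu, install the system package if the module is missing
```
sudo apt install python3-venv
```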
Create a virtual environment
```
python -m venv env
```
Activate the virtual environment (on Windows, run env\Scripts\activate instead)
```
source env/bin/activate
```

Install the packages using requirements.txt
```
pip install -r requirements.txt
```
Run the script using the terminal
```
python main.py
```


Outputs

The script produces a separate CSV file for each category. Each row describes one product page and contains the following fields:

product_page_url
universal_product_code (upc)
title
price_including_tax
price_excluding_tax
number_available
product_description
category
review_rating
image_url
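
As a rough sketch of how rows with these fields could be written, the example below uses `csv.DictWriter` from the standard library. The helper name `write_products`, the exact column spelling `universal_product_code`, and the sample values are assumptions made for illustration; the repository may organise this differently.

```python
import csv
import io

FIELDNAMES = [
    "product_page_url",
    "universal_product_code",
    "title",
    "price_including_tax",
    "price_excluding_tax",
    "number_available",
    "product_description",
    "category",
    "review_rating",
    "image_url",
]


def write_products(csv_file, products):
    """Write a header row followed by one row per product dictionary."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(products)


# Placeholder values for illustration only, not real scraped data.
sample = dict.fromkeys(FIELDNAMES, "")
sample.update(title="Example Book", category="Example", number_available=3)

# An in-memory buffer stands in for a real file such as
# open("Example.csv", "w", newline="", encoding="utf-8").
buffer = io.StringIO()
write_products(buffer, [sample])
```

When writing to a real file, open it with `newline=""` as the `csv` documentation recommends, so that no blank lines appear between rows on Windows.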

The script also saves the product image for each product page.
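
One possible way to store the images is sketched below. The helper names `image_path` and `save_image` and the title-based file naming are hypothetical; the default folder name `image` is taken from the folder of that name in the repository. The image bytes would typically come from `requests.get(image_url).content`.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def image_path(title, image_url, folder="image"):
    """Build a filesystem-safe path for a book image from its title."""
    # Replace every run of characters other than letters, digits,
    # hyphens and underscores with a single underscore.
    safe_title = re.sub(r"[^A-Za-z0-9_-]+", "_", title).strip("_")
    # Keep the extension of the source image, defaulting to .jpg.
    extension = Path(urlparse(image_url).path).suffix or ".jpg"
    return Path(folder) / f"{safe_title}{extension}"


def save_image(content, path):
    """Write raw image bytes to `path`, creating the folder if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
```

Sanitising the title avoids characters such as `:` or `/` that are invalid in file names on some systems.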
