GPU Price Tracker

A web scraping project that collects graphics card data from e-commerce websites and displays it in a user-friendly interface.

Features

Scrapes GPU product information from Amazon
Extracts product name, price, user rating, memory size, GPU model, and specifications
Saves data to CSV files for easy analysis
Includes a simple HTML frontend to view the collected data

Project Structure

webScraper/
├── gpu_scraper/            # Scrapy project directory
│   ├── spiders/            # Spider implementations
│   │   └── amazon_gpu.py   # Amazon GPU spider
│   ├── items.py            # Item definitions
│   ├── middlewares.py      # Spider and downloader middlewares
│   ├── pipelines.py        # Item pipelines
│   └── settings.py         # Project settings
├── output/                 # Output directory for scraped data
│   └── amazon_gpu_prices.csv
├── index.html              # Frontend HTML interface
└── scrapy.cfg              # Scrapy configuration file

Installation

Clone this repository:

git clone https://github.com/yourusername/gpu-price-tracker.git
cd gpu-price-tracker

Create and activate a virtual environment:

python -m venv venv
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate

Install dependencies:
```
pip install -r requirements.txt
```
Install Playwright browsers (required for JavaScript rendering):
```
playwright install
```

Usage

Running the Scraper

To run the Amazon GPU spider:

cd webScraper
scrapy crawl amazon_search_product

The scraped data will be saved to output/amazon_gpu_prices.csv.

Viewing the Data

To view the scraped data in the web interface:

Start a local web server:
```
python -m http.server 8000
```
Open a web browser and navigate to:
```
http://localhost:8000/index.html
```

Configuration

You can modify the scraper settings in gpu_scraper/settings.py:

USER_AGENT: Browser user agent string
DOWNLOAD_DELAY: Delay between requests (in seconds)
CONCURRENT_REQUESTS_PER_DOMAIN: Maximum concurrent requests per domain

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Scrapy - The web scraping framework used
Playwright - Used for JavaScript rendering

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
gpu_scraper		gpu_scraper
.gitignore		.gitignore
README.md		README.md
index.html		index.html
scrapy.cfg		scrapy.cfg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GPU Price Tracker

Features

Project Structure

Installation

Usage

Running the Scraper

Viewing the Data

Configuration

License

Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

GPU Price Tracker

Features

Project Structure

Installation

Usage

Running the Scraper

Viewing the Data

Configuration

License

Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages