feat: add examples for scraping e commerce #214

Open: wants to merge 1 commit into base: main
2 changes: 2 additions & 0 deletions ecommerce-scraper/.env.example
@@ -0,0 +1,2 @@
OPENAI_API_KEY="your_openai_api_key"
SCRAPEGRAPH_API_KEY="your_scrapegraph_api_key"
68 changes: 68 additions & 0 deletions ecommerce-scraper/README.md
@@ -0,0 +1,68 @@
# E-commerce Scraper

A Python-based web scraping tool built with CrewAI and ScrapegraphAI ([Scrapegraph](https://scrapegraph.ai/)) that extracts product information from e-commerce websites. It is currently configured to scrape keyboard listings from eBay Italy.

## Features

- Automated web scraping using CrewAI agents
- Integration with Scrapegraph for reliable data extraction
- Configurable for different product searches
- Environment-based configuration for API keys

## Prerequisites

- Python 3.10 or higher (CrewAI does not support older versions)
- OpenAI API key
- Scrapegraph API key

## Installation

1. Clone the repository:
```bash
git clone <repository-url>
cd ecommerce-scraper
```

2. Install the required dependencies:
```bash
pip install crewai crewai-tools scrapegraph-py python-dotenv
```

3. Set up environment variables:
- Copy `.env.example` to `.env`
- Add your API keys to the `.env` file:
```plaintext
OPENAI_API_KEY="your_openai_api_key"
SCRAPEGRAPH_API_KEY="your_scrapegraph_api_key"
```
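
Before running the scraper, it can help to confirm that both keys are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_keys` is hypothetical and not part of this project:

```python
import os

# The two variables listed in .env.example.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SCRAPEGRAPH_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are absent or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```

If you use this check inside `ecommerce_scraper.py`, call it after `load_dotenv()` so that values from `.env` are already in the environment.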

## Usage

Run the scraper:
```bash
python ecommerce_scraper.py
```

The script will:
1. Connect to eBay Italy
2. Search for keyboards
3. Extract product information
4. Output the results
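
If you would rather keep the results in a file, one option is to write the value returned by `crew.kickoff()` to disk. This is only a sketch: the helper `save_result` and the default file name `keyboards.txt` are hypothetical, and it assumes that `str(result)` yields the final text of the crew output:

```python
from pathlib import Path


def save_result(result, path="keyboards.txt"):
    """Write the text form of a crew result to `path` and return the path."""
    out = Path(path)
    # str() is used so the helper accepts either a plain string or a result object.
    out.write_text(str(result), encoding="utf-8")
    return out
```

In the script, this could be used as `save_result(crew.kickoff())`.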

## Customization

To scrape different products or websites, modify the `website` variable in `ecommerce_scraper.py`:

```python
website = "https://www.ebay.it/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=your_search_term&_sacat=0"
```

Replace `your_search_term` with the product you want to search for.
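
Search terms that contain spaces or special characters need to be URL-encoded. A small sketch of one way to build the URL with the standard library follows; the function name `build_search_url` is hypothetical, and the query parameters are copied from the URL above:

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebay.it/sch/i.html"


def build_search_url(search_term):
    """Return an eBay Italy search URL for `search_term`, URL-encoded."""
    params = {
        "_from": "R40",
        "_trksid": "m570.l1313",
        "_nkw": search_term,
        "_sacat": "0",
    }
    return f"{BASE_URL}?{urlencode(params)}"


website = build_search_url("mechanical keyboard")
print(website)
# https://www.ebay.it/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=mechanical+keyboard&_sacat=0
```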

## License

[Add your chosen license here]

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
30 changes: 30 additions & 0 deletions ecommerce-scraper/ecommerce_scraper.py
@@ -0,0 +1,30 @@
from crewai import Agent, Crew, Task

from crewai_tools import ScrapegraphScrapeTool
from dotenv import load_dotenv

load_dotenv()  # load OPENAI_API_KEY and SCRAPEGRAPH_API_KEY from .env

website = "https://www.ebay.it/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=keyboard&_sacat=0"
tool = ScrapegraphScrapeTool()  # reads SCRAPEGRAPH_API_KEY from the environment

agent = Agent(
    role="Web Researcher",
    goal="Research and extract accurate information from websites",
    backstory="You are an expert web researcher with experience in extracting and analyzing information from various websites.",
    tools=[tool],
)

task = Task(
    name="scraping task",
    description=f"Visit the website {website} and extract detailed information about all the keyboards available.",
    expected_output="A structured list with the information extracted from the website.",
    agent=agent,
)

crew = Crew(
    agents=[agent],
    tasks=[task],
)

print(crew.kickoff())  # print the final output so the results are visible