An AI-powered web scraper application that leverages free LLM providers to perform intelligent web scraping based on user prompts. The system allows users to input scraping instructions via a user-friendly UI, process web data using LLM APIs, and output results in professional formats.
- 🌐 Intelligent Web Scraping: Uses AI to understand and extract data based on natural language prompts
- 🎨 Modern UI: Responsive React frontend with TypeScript and Tailwind CSS
- ⚡ Fast Backend: High-performance FastAPI backend with async support
- 📄 Multiple Output Formats: Generate results in Word, PDF, Excel, or text formats
- 🔐 Secure: API keys stored securely in environment variables
- 🧪 Well-Tested: Comprehensive test suite with high coverage
- 📚 Well-Documented: Complete guides for developers, testers, and users
- Frontend: Vite + React + TypeScript + Tailwind CSS
- Backend: FastAPI + Python
- LLM Integration: OpenRouter/OpenAI APIs
- Output Generation: python-docx, reportlab, openpyxl
- Testing: Pytest (backend), Vitest (frontend)
- Node.js 18+ and npm
- Python 3.8+
- Git
-
Clone the repository
git clone <repository-url> cd ai-webscraper
-
Backend Setup
cd backend python -m venv venv venv\Scripts\activate # On Windows pip install -r requirements.txt # Copy .env.example to .env and add your API keys copy .env.example .env
-
Frontend Setup
cd frontend npm install -
Run the Application
Terminal 1 (Backend):
cd backend uvicorn app.main:app --reloadTerminal 2 (Frontend):
cd frontend npm run dev -
Access the Application
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
ai-webscraper/
├── backend/ # FastAPI backend
├── frontend/ # Vite React frontend
├── docs/ # Documentation
└── README.md # This file
MIT License