This project scrapes a food recipe web page and summarizes its content using your configured LLM. It is specialized to handle cooking recipes and provides chef-level, easy-to-understand summaries. If the provided website is not a food recipe, the service will inform you it cannot summarize non-recipe content.
- A production-ready FastAPI service with Basic Authentication
- A reusable summarization module that summarizes recipes with a "multicuisine chef" persona
- A Jupyter notebook for experimentation
- `api_service.py`: FastAPI app exposing `POST /v1/summarize`.
- `website_summarizer.py`: Orchestrates recipe scraping and calls the LLM.
- `scraper.py`: Extracts website text (Selenium/BS4 based).
- `practice.ipynb`: Sample notebook usage.
- `pyproject.toml`: Dependencies managed by `uv`.
- Python 3.12+
- `uv` (recommended) or `pip`
- Chrome/Chromedriver available for Selenium (if using Selenium paths)
- Install dependencies:

  ```bash
  uv sync
  ```

- Create a `.env` in the project root and set keys:

  ```
  OPENAI_API_KEY=sk-...
  GEMINI_API_KEY=...
  API_USERNAME=your_user
  API_PASSWORD=your_pass
  ```

Notes:
- You can use OpenAI or Gemini via the existing code paths. Ensure keys are set accordingly.
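One simple way the key check could work (illustrative only; the project's actual provider selection lives in its existing code paths and may differ):

```python
# Hypothetical helper: pick a provider based on which key is present.
import os


def pick_provider() -> str:
    # Prefers OpenAI if both keys are set; raises if neither is configured.
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    if os.environ.get("GEMINI_API_KEY"):
        return "gemini"
    raise RuntimeError("Set OPENAI_API_KEY or GEMINI_API_KEY in .env")
```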
```bash
uv run uvicorn api_service:app --host 0.0.0.0 --port 8000
```

Health check:

```bash
curl -s http://localhost:8000/health
```

Summarize a recipe (POST JSON, Basic Auth):

```bash
curl -X POST "http://localhost:8000/v1/summarize" \
  -H "Authorization: Basic $(printf '%s' 'your_user:your_pass' | base64)" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.food.com/recipe/creamy-garlic-penne-pasta-43023"}'
```

Response:
```json
{
  "url": "https://www.food.com/recipe/creamy-garlic-penne-pasta-43023",
  "summary": "..."
}
```

- Launch Jupyter with the same environment:
  ```bash
  uv run jupyter notebook
  ```

- In the notebook, avoid `print(display(Markdown(...)))`; use `display(Markdown(...))` or return `Markdown(...)` as the last cell expression.
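From a notebook cell you can also call the running API directly. The sketch below builds the same Basic Auth header the `curl` example constructs, using only the standard library; the endpoint URL and credentials are placeholders:

```python
# Stdlib-only client sketch for the summarize endpoint (URL/credentials are placeholders).
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    # Same value curl builds with $(printf '%s' 'user:pass' | base64).
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"


def summarize_via_api(url: str, username: str, password: str,
                      endpoint: str = "http://127.0.0.1:8000/v1/summarize") -> str:
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"url": url}).encode(),
        headers={
            "Authorization": basic_auth_header(username, password),
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["summary"]
```

In a notebook, render the result with `display(Markdown(summarize_via_api(...)))` rather than wrapping it in `print(...)`.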
- If `curl` appears to hang:
  - Ensure the server is running and reachable: `lsof -i :8000`
  - Try the IPv4 loopback explicitly: `curl -v http://127.0.0.1:8000/health`
  - Kill previous processes on the same port: `kill -9 $(lsof -t -i :8000)`
- Selenium timeouts: ensure Chrome is installed and accessible; adjust waits in `scraper.py`.
- Basic Auth is required when `API_USERNAME`/`API_PASSWORD` are set.
- Tighten CORS and logging in production; use a process manager (systemd, Docker, or similar).
MIT (or your preferred license).