A Reddit scraper created for the Buy From EU project. It scrapes products from the subreddit based on search terms, distills the data using an LLM, and formats the result.

perelloliver/BuyFromEU-Reddit-Scraper

buyfromEU_reddit_scraper

A Reddit scraper created for the Buy From EU project. It scrapes products from the subreddit based on search terms, distills the data using an LLM, and formats the result. It has been written to be accessible to users of all skill levels. Feel free to reach out if you have any questions.

Required:

  • Mistral API key, paid tier
  • Reddit developer project credentials
  • A Reddit account
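
One way to keep these credentials out of the notebook itself is to read them from environment variables. The sketch below is only an illustration: the variable names and the `load_credentials` helper are assumptions, not names used by the notebook.

```python
import os

# Hypothetical variable names; rename them to match however you store your secrets.
REQUIRED_VARS = (
    "MISTRAL_API_KEY",
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
)


def load_credentials(env=os.environ):
    """Return the required credentials as a dict, or fail early listing what is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_credentials()` at the top of the notebook surfaces a missing key before any scraping or LLM cost is incurred.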

How to:

  • If you wish to run this in your own environment, download the notebook and use it locally.
  • Otherwise, use the Colab notebook here.

Approximate cost: under 5 EUR in LLM costs.

Todo: Data formatting needs improvement. Known issues include lists stored as strings and misformatted items. In addition, no item in the European column should also appear in the American column.
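
A possible starting point for this cleanup is sketched below. It is not part of the notebook: the helper names `parse_list_cell` and `drop_overlap` are hypothetical, and the choice to remove overlapping items from the American side (rather than flag them for review) is an assumption.

```python
import ast


def parse_list_cell(value):
    """Turn a cell such as "['Brand A', 'Brand B']" or "Brand A, Brand B" into a list of clean strings."""
    if isinstance(value, list):
        items = value
    else:
        text = str(value).strip()
        try:
            # Handles cells where a Python-style list was saved as a string.
            parsed = ast.literal_eval(text)
            items = list(parsed) if isinstance(parsed, (list, tuple)) else [parsed]
        except (ValueError, SyntaxError, TypeError):
            # Not a Python literal: fall back to a plain comma-separated split.
            items = text.split(",")
    return [str(item).strip() for item in items if str(item).strip()]


def drop_overlap(european, american):
    """Remove from the American list any item that also appears in the European list (case-insensitive)."""
    seen = {item.casefold() for item in european}
    return [item for item in american if item.casefold() not in seen]
```

For example, `drop_overlap(parse_list_cell(row["european"]), parse_list_cell(row["american"]))` would give a cleaned American list for one row, assuming the rows are dicts keyed by column name.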

Notes:

  • If you are re-running the scraper to collect data as more posts are shared, adjust the timeframe in the code to avoid scraping and processing duplicate information.
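
The idea behind adjusting the timeframe can be sketched as a simple cutoff filter. This is an illustration under assumptions, not the notebook's code: it assumes each post is a dict carrying Reddit's `created_utc` Unix timestamp, and `filter_new_posts` and `last_run` are hypothetical names.

```python
from datetime import datetime, timezone


def filter_new_posts(posts, last_run):
    """Keep only posts created after the previous run, so nothing is sent to the LLM twice."""
    cutoff = last_run.timestamp()
    return [post for post in posts if post["created_utc"] > cutoff]


# Example: the previous collection ran on 1 March 2025 (UTC).
last_run = datetime(2025, 3, 1, tzinfo=timezone.utc)
posts = [
    {"id": "old", "created_utc": datetime(2025, 2, 28, tzinfo=timezone.utc).timestamp()},
    {"id": "new", "created_utc": datetime(2025, 3, 2, tzinfo=timezone.utc).timestamp()},
]
new_posts = filter_new_posts(posts, last_run)
```

Storing the time of each run and passing it in as `last_run` keeps both scraping time and LLM costs proportional to the new posts only.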
