A tool for scraping and generating Q&A pairs for RNN training using GPT-4.
- Python 3.9 or higher
- Poetry (package manager)
- OpenAI API Key
Install Poetry (if not already installed):
curl -sSL https://install.python-poetry.org | python3 -
Install dependencies:
poetry install
Set up environment variables:
- Copy `.env.example` to `.env`
- Add your OpenAI API Key to the `.env` file
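As a rough illustration of how a script might pick up the key from the `.env` file, here is a minimal standard-library sketch. The variable name `OPENAI_API_KEY` is an assumption (it is the name the OpenAI Python library looks for by default, but check `.env.example` for the name this project actually uses), and `load_env_file` and `get_api_key` are hypothetical helpers, not part of this repository.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env", name="OPENAI_API_KEY"):
    """Return the key from the process environment, falling back to the .env file.

    The default name is an assumption; adjust it to match .env.example.
    """
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to {path}")
    return key
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication error partway through a scraping run.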
Start the scraper in the Poetry environment:
poetry run python scrape.\*.py
Generated Q&A pairs will be saved to `data/*.jsonl`.
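To inspect the output after a run, something like the following sketch could load every JSONL file in `data/`. It assumes only that each non-empty line is a standalone JSON value, which is what the JSONL format implies; the shape of each record is not documented here, so the code does not rely on any field names. `read_jsonl` and `load_all` are hypothetical helpers, not part of this repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_all(data_dir="data"):
    """Collect records from every .jsonl file in data_dir, in sorted filename order."""
    records = []
    for file_path in sorted(Path(data_dir).glob("*.jsonl")):
        records.extend(read_jsonl(file_path))
    return records


if __name__ == "__main__":
    records = load_all()
    print(f"Loaded {len(records)} records")
```

Reading line by line keeps memory use low for a single large file; `load_all` trades that away for convenience, so prefer iterating over `read_jsonl` directly if the dataset grows.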