A Retrieval-Augmented Generation (RAG) application that processes PDF documents and answers questions using OpenAI's API.
- Docker installed
- OpenAI API key
For easier management, use the provided Makefile:
- Set your OpenAI API key: export OPENAI_API_KEY="your-openai-api-key-here"
- Build and run: make quick-start
- Or build separately: make build, then make run
- Run interactively: make run-interactive
If you want to run locally without Docker:
export OPENAI_API_KEY="your-openai-api-key-here"
go run main.go
After making code changes:
make build
make run
The application uses the following environment variables:
- OPENAI_API_KEY: Your OpenAI API key (required)
- PDF text extraction using embedded PDF file
- Text chunking for optimal processing
- Vector embeddings using OpenAI's text-embedding-ada-002
- Semantic search with cosine similarity
- Question answering using GPT-4
- Missing API Key. Error: Missing OPENAI_API_KEY env var. Solution: Ensure your OpenAI API key is set as an environment variable.
- PDF Text Corruption: If you see "heavily corrupted text" messages, the PDF extraction may have issues. Check the debug output for text quality.
- Memory Issues: For large PDFs, you may need to increase Docker memory limits or optimize chunk sizes.
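To make the chunk-size trade-off concrete, here is a minimal sketch of word-based chunking with overlap. It is an assumption about how chunking could work, not the application's actual strategy; the function name chunkWords and its size and overlap parameters are hypothetical.

```go
package main

import "strings"

// chunkWords splits text into chunks of at most size words, where
// consecutive chunks share overlap words. It returns nil when size <= 0,
// and clamps overlap to the range [0, size-1] so the loop always advances.
func chunkWords(text string, size, overlap int) []string {
	if size <= 0 {
		return nil
	}
	if overlap < 0 {
		overlap = 0
	}
	if overlap >= size {
		overlap = size - 1
	}
	words := strings.Fields(text)
	step := size - overlap
	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break // the final chunk already reaches the end of the text
		}
	}
	return chunks
}
```

Smaller chunks mean more embeddings to compute and hold in memory but tighter retrieval; larger chunks reduce the count at the cost of coarser matches.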
The application includes debug output that shows:
- Extracted text samples
- Number of chunks created
- Retrieved chunks for each query
Run make help to see all available commands:
build                Build the Docker image
run                  Run the container with Docker
run-interactive      Run the container interactively
run-detached         Run the container in detached mode
dev                  Run the application locally (requires Go)
test                 Run tests
clean                Remove the Docker image
quick-start          Build and run the application quickly
PDF → Text Extraction → Chunking → Embeddings → Vector Store → Similarity Search → LLM → Answer
MIT License