A backend system that recommends LeetCode problems based on YouTube DSA tutorial content using semantic similarity matching.
Vid-Quiz-Gen extracts transcripts from YouTube DSA tutorials, processes them using AI-powered summarization, and matches them against a database of LeetCode problems using vector embeddings to suggest relevant practice problems.
- Transcript Extraction: Scrapes YouTube video captions using Colly
- AI Summarization: Summarizes transcripts using Gemini 2.0 Flash
- Semantic Search: Generates 768-dimensional embeddings with the Gemini API and queries them using pgvector
- Problem Matching: Recommends top LeetCode problems based on video content
- gRPC Service: Python microservice for scraping LeetCode problem statements
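
The matching step in the list above can be illustrated with a small, self-contained sketch. It ranks problems by cosine similarity between a video-summary embedding and each problem embedding. This is an illustration, not the project's actual code: the function names are hypothetical, the toy 3-dimensional vectors stand in for the real 768-dimensional embeddings, and cosine similarity is an assumption (pgvector also supports L2 and inner-product distance, and the project may use either).

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_matches(
    query: Vector, problems: Dict[str, Vector], k: int = 3
) -> List[Tuple[str, float]]:
    """Rank problems by similarity to the query embedding, best first."""
    scored = [
        (title, cosine_similarity(query, vec)) for title, vec in problems.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy data (made up for illustration): real embeddings would come from the
# embedding model and be stored in PostgreSQL.
problems = {
    "Two Sum": [0.9, 0.1, 0.0],
    "Binary Tree Inorder Traversal": [0.0, 0.2, 0.9],
    "Contains Duplicate": [0.8, 0.3, 0.1],
}
video_summary_embedding = [1.0, 0.2, 0.0]

ranked = top_matches(video_summary_embedding, problems, k=2)
for title, score in ranked:
    print(f"{title}: {score:.3f}")
```

In a deployment backed by pgvector, the same ranking would typically be done inside the database with an `ORDER BY` on a distance operator (for example `<=>` for cosine distance) rather than in application code.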
- Backend: Golang, Python
- Database: PostgreSQL with pgvector extension
- AI/ML: Gemini API (text-embedding-004, gemini-2.0-flash)
- Communication: gRPC, Protocol Buffers
- Web Scraping: Colly (Go), leetscrape (Python)
- Go 1.21+
- Python 3.9+
- PostgreSQL with pgvector extension
- Gemini API key
- Clone the repository
git clone https://github.com/Sayan-995/vidquizgen.git
cd vidquizgen
- Install Go dependencies
go mod download
- Install Python dependencies
cd lcscrape
pip install -r requirements.txt
- Set environment variables
export GEMINI_API_KEY=your_api_key
export DATABASE_URL=your_postgres_connection_string
- Run the services
# Start Python gRPC server (run from the repository root)
cd lcscrape && python lcscrape.py
# Start Go backend (in another terminal)
go run main.go
Generate LeetCode problem recommendations from a YouTube video.
Request:
{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID"
}
Response:
{
  "questions": [
    {
      "title": "Two Sum",
      "url": "https://leetcode.com/problems/two-sum"
    }
  ]
}
MIT