A RAG application for searching articles and getting answers on relevant topics from your favorite Substack newsletters
- Benito Martin | AI / ML Engineer | AI Echoes Newsletter
- Miguel Otero Pedrido | AI / ML Engineer | YouTube | The Neural Maze Newsletter
Unlike basic tutorials, this course provides a comprehensive, hands-on guide to building a complete end-to-end Retrieval-Augmented Generation (RAG) system using modern tools and best practices. You’ll see how to:
- Automate data pipelines for ingesting and processing newsletter content
- Integrate multiple cloud and open-source services (Supabase, Qdrant, Prefect, FastAPI)
- Build a robust backend for keyword and LLM-powered search
- Deploy and interact with your system using Google Cloud, a Gradio UI and REST API
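The "keyword and LLM-powered search" backend ultimately needs to merge two rankings: keyword matches and vector-similarity hits. In the course this fusion happens inside Qdrant's hybrid search, but the core idea can be shown with a small pure-Python sketch of Reciprocal Rank Fusion (the document IDs below are made up for illustration):

```python
# Sketch of hybrid-search fusion: merge a keyword-match ranking and a
# vector-similarity ranking using Reciprocal Rank Fusion (RRF).
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Score each doc by sum of 1/(k + rank + 1) across all rankings."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["a3", "a1", "a7"]  # hypothetical keyword-search ranking
vector_hits = ["a1", "a5", "a3"]   # hypothetical vector-search ranking
print(rrf([keyword_hits, vector_hits]))  # ['a1', 'a3', 'a5', 'a7']
```

Documents appearing near the top of both rankings (like `a1`) win over documents that rank highly in only one, which is exactly the behavior hybrid search is after.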
| Audience | Why Join? |
|---|---|
| ML/AI Engineers | Build scalable RAG and LLM-powered search systems |
| Software Engineers | Learn modern backend, API, and cloud deployment skills |
| Data Engineers | Automate data pipelines and vector search workflows |
| AI Enthusiasts | Get hands-on with real-world, production-grade tools |
By the end of this course, you will have a fully functional RAG system and the skills to build production-ready applications to search over your favorite newsletters. You will:
- Ingest articles from RSS feeds and store them in Supabase
- Generate and index embeddings in Qdrant, with payload indexes for filtering, an optimized index configuration with quantization, and hybrid search
- Orchestrate and schedule workflows with Prefect (local and cloud)
- Build and expose RESTful search endpoints using FastAPI
- Integrate multiple LLM providers (OpenRouter, OpenAI, Hugging Face)
- Deploy your backend to Google Cloud Run for global access
- Create an interactive Gradio UI for end-users
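To give a flavor of the first step, ingestion boils down to parsing RSS items into rows ready for a database insert. This stdlib-only sketch stops short of the actual Supabase write, and the feed content and field names are invented for illustration:

```python
# Sketch of the ingestion step: parse an RSS 2.0 feed and collect article
# rows ready for insertion into a table. The real pipeline writes these
# rows to Supabase; this sketch stops at the row dicts.
import xml.etree.ElementTree as ET

SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Newsletter</title>
    <item>
      <title>First Post</title>
      <link>https://example.com/p/first-post</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second Post</title>
      <link>https://example.com/p/second-post</link>
      <pubDate>Tue, 02 Jan 2024 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def parse_feed(xml_text: str) -> list[dict]:
    """Extract title, link, and pubDate from each <item> in the feed."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        rows.append({
            "title": item.findtext("title"),
            "url": item.findtext("link"),
            "published_at": item.findtext("pubDate"),
        })
    return rows

articles = parse_feed(SAMPLE_RSS)
print(len(articles))  # 2
```

Each dict maps directly onto a Postgres row, so the only remaining step is an insert call with your Supabase client of choice.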
- Python (Intermediate)
- Basic understanding of REST APIs
- Familiarity with AI/LLM concepts is helpful
- Modern laptop/PC (no GPU required; free tiers are sufficient)
- This course is completely free to access and learn from. Starring and sharing the repository is appreciated!
- Google Cloud Run monthly free tier is sufficient for deployment
- The Prefect Cloud monthly free tier is sufficient for orchestration once your flow is deployed, but the local Prefect server is recommended for development since it has no usage limits
- Supabase and Qdrant monthly free tiers are sufficient for hosting the Postgres and vector databases
- OpenRouter's daily quota of requests to free LLM models is sufficient for LLM calls; since the project supports multiple LLM providers, you can also use OpenAI or Hugging Face as backups
- Any other tools used in this course, such as FastAPI, Docker, Gradio, or Opik, are completely free to use
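The multi-provider setup above works as a fallback chain: try OpenRouter first and fall back to OpenAI or Hugging Face when a call fails. The provider functions below are hypothetical stand-ins, not the project's actual clients; the sketch only shows the fallback logic:

```python
# Sketch of multi-provider LLM fallback: try each provider in order and
# return the first successful answer. Provider callables are stand-ins.
from typing import Callable

def ask_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Return (provider_name, answer) from the first provider that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # e.g. a rate limit on the free tier
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))

# Simulated providers: OpenRouter's free quota is exhausted, OpenAI answers.
def openrouter(prompt: str) -> str:
    raise RuntimeError("daily free-request limit reached")

def openai(prompt: str) -> str:
    return f"answer to: {prompt}"

name, answer = ask_with_fallback(
    "What is RAG?", [("openrouter", openrouter), ("openai", openai)]
)
print(name)  # openai
```

In the real project each callable would wrap that provider's chat-completion API, but the control flow is the same.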
| Lesson | Topic | Substack Article | Description |
|---|---|---|---|
| 1 | Setup, Configuration & Articles Ingestion | Lesson 1 | Supabase Postgres setup and ingesting articles |
| 2 | Vector Embeddings & Semantic Search Infrastructure | Lesson 2 | Qdrant configuration and semantic search |
| 3 | FastAPI Backend & Multi-Provider LLM Support | Lesson 3 | FastAPI backend, OpenRouter, OpenAI, Hugging Face |
| 4 | Cloud Run Deployment & Gradio UI | Lesson 4 | Google Cloud Run deployment and Gradio UI |
| 5 | Video Application Overview | Lesson 5 | Video demo showcasing the entire pipeline |
Follow the INSTRUCTIONS.md in the documentation to set up your environment, install dependencies, and configure services.
All components are explained in detail in the documentation, but if you have any questions, feel free to open an issue or reach out!
This project integrates several best-in-class open-source and cloud services to provide a scalable, production-ready RAG pipeline:
| Service | Description | Docs/Links |
|---|---|---|
| Supabase | PostgreSQL database for articles | Supabase |
| Qdrant | Vector DB for embeddings | Qdrant |
| Prefect | Orchestration for ingestion/embedding | Prefect |
| OpenRouter | LLM Provider | OpenRouter |
| OpenAI / Hugging Face | Backup LLM providers | OpenAI / Hugging Face |
| Docker | Containerization | Docker |
| FastAPI | API for querying/search | FastAPI |
| Google Cloud SDK | Command-line interface for Google Cloud services | Google Cloud SDK |
| Gradio | UI | Gradio |
| Opik AI | LLM evaluation | Opik |
| Google Cloud Run | Deployment and hosting | Cloud Run |
This project is licensed under the MIT License - see the LICENSE file for details.



