Easy-vecDB (⚠️ Alpha Internal Test)

Caution

⚠️ Alpha version warning: This is an early internal build. It is not yet complete and may contain bugs. Issues and suggestions are very welcome via GitHub Issues.

中文 | English

📚 Online Documentation

📚 A Hands-on Tutorial on Vector Database Principles and Practices: From Zero to Production

🧭 Project Overview

Easy-vecDB is a structured learning project on vector databases for developers and researchers.
It covers the full path from foundational concepts and algorithmic principles to production-level deployment, organized around three areas:

  • 🧩 Theory Fundamentals: Understand the principles, architecture, and indexing mechanisms of vector databases
  • ⚙️ Hands-on Practice: Master the usage and optimization of Milvus / Faiss / Annoy
  • 💡 Project Cases: Build complete projects from scratch, including RAG systems, embedding-based retrieval, and clustering visualization

📖 Content Navigation

The content is organized into six parts, moving from fundamentals through library tutorials to applied projects and supplementary topics, as outlined below:

| Section | Key Content | Status |
|---|---|---|
| Part I: Fundamentals (Base) | Vector DB principles, embeddings, and search basics | |
| Chapter 1 Project Introduction | Project goals and learning roadmap | ✅ |
| Chapter 2 Why Vector Databases | Retrieval bottlenecks and similarity search | ✅ |
| Chapter 3 Vector Embedding Basics | Word2Vec, Transformer embeddings | ✅ |
| Chapter 4 Vector Search Basics | Brute-force search and similarity metrics | ✅ |
| Chapter 5 ANN Search Algorithms | IVF, PQ, HNSW, LSH, Annoy principles & code | ✅ |
| Chapter 6 Build Your Own Vector Database | Minimal vector DB implementation | ✅ |
| Part II: Annoy Tutorial | Lightweight ANN search library | |
| Chapter 1 Annoy Introduction & Setup | Installation and core concepts | ✅ |
| Chapter 2 Annoy Core API | Index building, querying, parameter tuning | ✅ |
| Chapter 3 Annoy Advanced Tips & Best Practices | Performance optimization, engineering practices | ✅ |
| Part III: Faiss Tutorial | High-performance vector search engine | |
| Chapter 1 FAISS Introduction & Setup | Installation and core concepts | ✅ |
| Chapter 2 FAISS Core Indexes | Flat, IVF, PQ, HNSW indexes | ✅ |
| Chapter 3 Advanced FAISS Features | Composite indexes, GPU, batch search | ✅ |
| Chapter 4 FAISS Performance Tuning | Recall, latency, memory optimization | ✅ |
| Chapter 5 FAISS Engineering Practices | Service deployment and real-world cases | ✅ |
| Part IV: Milvus Tutorial | Distributed vector database & engineering | |
| Chapter 1 Milvus Introduction: Concepts & Architecture | Architecture and core components | ✅ |
| Chapter 2 Milvus Core Concepts | Collection, Partition, Index | ✅ |
| Chapter 3 Milvus Basic Operations | Data ingestion, query, index management | ✅ |
| Chapter 4 Milvus AI Applications: Hybrid Search with BM25 | RAG and hybrid retrieval | ✅ |
| Chapter 5 Milvus AI Applications: Image Retrieval | Image retrieval system | ✅ |
| Chapter 6 Milvus Advanced Topics | Internal architecture, reranker, Milvus Lite, MinerU | ✅ |
| Part V: AI Applications Based on Vector Databases | | |
| Project 1 Recommendation Recall with Annoy | DSSM + Annoy vector recall | ✅ |
| Project 2 RAG with FAISS | RAG using FAISS | ✅ |
| Project 3 Agent with Milvus | Agent system using Milvus | ✅ |
| Project 4 RAG with Milvus & ArangoDB | Hybrid RAG system | ✅ |
| Part VI: Supplementary Topics | Related advanced topics | |
| Vector Fundamentals | Vector math and basics | ✅ |
| FusionANNS Architecture | GPU-accelerated retrieval | ✅ |
| Meta-Chunking Strategy | Intelligent text chunking | ✅ |
| Theoretical Limits of Retrieval | Performance boundaries | ✅ |
| RaBitQ Indexing | High-dimensional quantization | ✅ |
| Clustering Algorithms | Clustering overview | ✅ |

โณ Continuously updating...

๐Ÿ“˜ This project aims to help you master vector databases from principles โ†’ practice โ†’ deployment.
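
As a small taste of the brute-force search and similarity metrics listed under Part I, Chapter 4, the sketch below scores every stored vector against a query with cosine similarity and keeps the top-k. It is a minimal illustration in pure Python, not code taken from the tutorial: the function names (`cosine_similarity`, `brute_force_search`), the toy vectors, and the choice to score zero vectors as 0.0 are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def brute_force_search(query, vectors, k=2):
    """Score every stored vector against the query; return the top-k (id, score) pairs."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data, invented for this example.
vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
results = brute_force_search([1.0, 0.0, 0.0], vectors, k=2)
print([vec_id for vec_id, _ in results])  # prints ['doc_a', 'doc_b']
```

Every query here touches every stored vector, so cost grows linearly with collection size. That cost is what the ANN indexes covered later (IVF, PQ, HNSW, LSH, Annoy) aim to reduce, typically in exchange for some recall.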

🛠️ Project Structure

.
├── docs    Vector database tutorials and documentation
├── data    Common example datasets
├── src     Project-related source code
└── tmp     Temporary files

📄 Additional Resources

Related Competition

🤝 Contributing

  • If you find any issues, feel free to open an Issue. If there is no response, you may contact the Support Team.
  • If you'd like to contribute, submit a Pull Request. If there is no response, you may also contact the Support Team.
  • If you are interested in starting a new Datawhale project, please follow the Datawhale Open Source Project Guide.

Core Contributors

Special Thanks

  • Thanks to @Sm1les for the support and help
  • Thanks to all contributors who made this project possible ❤️

Follow Us

Scan the QR code below to follow the Datawhale official account

📊 Star History

Star History Chart

📜 License

This work is licensed under the
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Made with ❤️ by Datawhale