Semantic Code Search Engine

An AI-powered semantic code search engine. It ingests entire GitHub repositories, chunks their source code using AST parsing, generates embeddings for each chunk, and supports natural-language search and Retrieval-Augmented Generation (RAG) over your codebase.

🌟 Features

Semantic & Hybrid Search: Combines Dense Vector Search (cosine similarity) with BM25 Keyword Search using Reciprocal Rank Fusion (RRF).
Retrieval-Augmented Generation (RAG): Ask conversational questions about your codebase. The AI synthesizes answers using retrieved chunks and provides exact file/line citations.
AST-Aware Chunking: Intelligently parses TypeScript, JavaScript, and Python into semantically meaningful chunks (functions, classes) using tree-sitter, ensuring embeddings capture full context.
Polished UI: A "Linear-style" frontend with frosted glassmorphism, a dot-grid background, and streaming markdown rendering.
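
As a rough illustration of how Reciprocal Rank Fusion can merge the dense and BM25 result lists, here is a minimal sketch. It is not taken from this repository: the function name `reciprocalRankFusion`, the chunk IDs, and the smoothing constant `k = 60` (a commonly used default) are all assumptions made for the example.

```typescript
// Hypothetical sketch of Reciprocal Rank Fusion (RRF); not the project's actual code.
// Each input list holds chunk IDs ordered best-first by one retriever.
type RankedList = string[];

interface FusedResult {
  id: string;
  score: number;
}

// score(id) = sum over lists of 1 / (k + rank), where rank is 1-based.
// k = 60 is an assumed default; the project may use a different value.
function reciprocalRankFusion(lists: RankedList[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Made-up chunk IDs standing in for dense-vector and BM25 results.
const denseResults = ["auth.ts#login", "session.ts#refresh", "db.ts#connect"];
const bm25Results = ["session.ts#refresh", "util.ts#hash", "auth.ts#login"];

const fused = reciprocalRankFusion([denseResults, bm25Results]);
console.log(fused.map((r) => r.id));
```

Because RRF only looks at ranks, it needs no score normalization between cosine similarity and BM25, which is one reason it is a common choice for hybrid search.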

🛠️ Tech Stack

Runtime & Package Manager: Bun
Monorepo: Turborepo / Bun Workspaces
Frontend: Next.js (React), Tailwind CSS, Framer Motion, Lucide Icons, React Markdown
API: Hono (Edge-ready)
Vector Database: Qdrant
Cache & Rate Limiting: Redis
AI Models: Gemini (via OpenAI SDK compatibility layer) for embeddings and LLM generation.

📂 Project Structure

This is a monorepo structured as follows:

packages/api: The Hono API server handling search, RAG, and health checks.
packages/frontend: The Next.js web application.
packages/ingestion: The worker library for cloning, AST chunking, and embedding GitHub repositories into Qdrant.
packages/shared: Shared TypeScript types and core utilities (Observability, Caching).

🚀 Getting Started

Prerequisites

Bun installed locally
Docker & Docker Compose (for running Qdrant and Redis locally)

1. Install Dependencies

bun install

2. Start Infrastructure

Start the local Qdrant vector database and Redis cache:

docker-compose up -d

3. Environment Variables

Copy the example environment file and fill in your Gemini API key (or OpenAI key):

cp .env.example .env

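Set OPENAI_API_KEY to your Gemini API key from Google AI Studio. The variable keeps the OpenAI name because Gemini is accessed through the OpenAI SDK compatibility layer.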
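
To show why a Gemini key ends up in an OpenAI-named variable, the sketch below builds an embeddings request against Gemini's OpenAI-compatible endpoint. It is an illustration only: the helper `buildEmbeddingRequest` is hypothetical, the base URL is the one Google documents for OpenAI compatibility, and how this project actually configures its client (for example through the OpenAI SDK's base URL option) is an assumption.

```typescript
// Hypothetical sketch; the project's real client setup may differ.
// Assumed base URL: Gemini's documented OpenAI-compatible endpoint.
const GEMINI_OPENAI_BASE_URL =
  "https://generativelanguage.googleapis.com/v1beta/openai";

interface EmbeddingRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Builds (but does not send) an OpenAI-style embeddings request.
// The API key is the Gemini key stored in OPENAI_API_KEY.
function buildEmbeddingRequest(
  input: string[],
  apiKey: string,
  model = "text-embedding-004",
): EmbeddingRequest {
  return {
    url: `${GEMINI_OPENAI_BASE_URL}/embeddings`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, input }),
    },
  };
}

// Sending it requires network access and a valid key, so it is left commented out:
// const { url, init } = buildEmbeddingRequest(["function add(a, b) { return a + b; }"], apiKey);
// const response = await fetch(url, init);
```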

4. Run the Development Servers

Start the frontend and API servers in separate terminals:

# Start the Next.js Frontend (Runs on port 3000)
bun run dev

# In a separate terminal, start the Hono API (Runs on port 3001)
bun run dev:api

🧠 How It Works

Ingestion: You submit a GitHub URL on the frontend. The API triggers the ingestion pipeline.
Chunking: The repository is cloned. Supported languages are parsed into an Abstract Syntax Tree (AST) to extract functions and classes. Other files use a sliding-window chunker.
Embedding: Chunks are embedded using text-embedding-004 (or your chosen model) and stored in Qdrant.
Retrieval: When you search, the query is embedded and matched against the vectors in Qdrant while a BM25 keyword search runs in parallel. The two result lists are merged via RRF.
Generation: If you ask a question, the top chunks are injected into a token-budgeted context window, and the LLM streams back a synthesized answer with file and line citations.
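
The generation step can be pictured with the following minimal sketch of token-budgeted context assembly. It is not the project's implementation: the `Chunk` shape, the `buildContext` helper, the citation header format, and the four-characters-per-token estimate are all assumptions chosen to keep the example self-contained.

```typescript
// Hypothetical sketch of packing retrieved chunks into a token budget.
interface Chunk {
  file: string;
  startLine: number;
  endLine: number;
  text: string;
}

// Crude estimate (about 4 characters per token); an assumption, not a real tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Walks chunks in ranked order and keeps each one that still fits the budget.
// Every kept chunk is prefixed with a file:line header the LLM can cite.
function buildContext(
  chunks: Chunk[],
  maxTokens: number,
): { context: string; used: Chunk[] } {
  const parts: string[] = [];
  const used: Chunk[] = [];
  let remaining = maxTokens;
  for (const chunk of chunks) {
    const block = `[${chunk.file}:${chunk.startLine}-${chunk.endLine}]\n${chunk.text}`;
    const cost = estimateTokens(block);
    if (cost > remaining) continue; // too large; a smaller, lower-ranked chunk may still fit
    parts.push(block);
    used.push(chunk);
    remaining -= cost;
  }
  return { context: parts.join("\n\n"), used };
}

// Made-up retrieval results, best first.
const retrieved: Chunk[] = [
  { file: "src/auth.ts", startLine: 10, endLine: 24, text: "export function login() { /* ... */ }" },
  { file: "src/big.ts", startLine: 1, endLine: 400, text: "x".repeat(400) },
  { file: "src/db.ts", startLine: 3, endLine: 9, text: "export const db = connect();" },
];

const packed = buildContext(retrieved, 40);
console.log(packed.context);
```

Skipping an oversized chunk instead of stopping at it is one possible policy; stopping at the first chunk that does not fit would preserve strict rank order at the cost of unused budget.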

🔒 CI/CD & Deployment

The repository uses GitHub Actions for continuous integration (Typechecking, Linting, Testing).
The API is configured for deployment on Railway (via Dockerfile).
The frontend is configured for deployment on Vercel.

License

MIT
