name

search-concepts

summary

Couchbase Full-Text Search and Vector Search concepts — index prerequisites, query types, hybrid search, RYOW consistency, Flex Index, RAG pipeline, and pagination

description

Couchbase Full-Text Search and Vector Search concepts — index prerequisites, query types, hybrid search, RYOW consistency, Flex Index, RAG pipeline, and pagination

metadata

last_verified	min_server_version
2026-05	7.0

handoff

condition	type	skill
user asks for language-specific search code	variant	search-python

condition	skill
user asks about Couchbase fundamentals or core concepts	getting-started

condition	skill
user asks about SQL++ queries or the SEARCH() function	sqlpp-language

condition	skill
user asks about document structure or data modeling for search	server-data-modeling

Language routing: The language-specific handoff in the frontmatter above is a variant edge. If the user's language is known, route directly to the matching search-<lang> skill. If unknown, ask before routing.

Couchbase Search — Concepts

Couchbase provides two complementary search capabilities built on the same index and query infrastructure:

Full-Text Search (FTS) — text analysis, fuzzy matching, phrase search, geospatial
Vector Search — nearest-neighbour semantic search over embedding vectors

Both use the same index service (port 8094), the same SDK API (cluster.search() / cluster.searchQuery()), and can be combined in a single hybrid search query.
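As a rough illustration of a combined query, the sketch below builds a request body that pairs a text match with a k-nearest-neighbour clause, in the general shape the Search service REST API accepts. The helper name `build_hybrid_request` and the field names `text`, `source`, and `embedding` are assumptions for illustration, not part of this skill; check the body against the Search REST API reference for your server version before relying on it.

```python
import json


def build_hybrid_request(text, query_vector, k=5, size=5):
    """Build a hybrid Search request body (hypothetical helper).

    Assumes the index maps a text field named "text" and a vector
    field named "embedding"; both names are placeholders.
    """
    return {
        # Full-text part: analysed match against the "text" field.
        "query": {"match": text, "field": "text"},
        # Vector part: the k nearest neighbours of query_vector.
        "knn": [{"field": "embedding", "vector": list(query_vector), "k": k}],
        # Stored fields to return with each hit, and the page size.
        "fields": ["text", "source"],
        "size": size,
    }


body = build_hybrid_request("reset password", [0.1, 0.2, 0.3], k=3)
print(json.dumps(body, indent=2))
```

A body like this would be sent to the Search service on port 8094; the exact endpoint path depends on whether the index is scoped, so it is left out here.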

For index prerequisites, query types, hybrid search, RYOW consistency, Flex Index, and pagination, see shared/server/search-concepts.md.

RAG Pipeline

Retrieval-Augmented Generation (RAG) uses Couchbase vector search to retrieve relevant documents and pass them as context to a language model.
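To make "retrieve relevant documents" concrete, here is a minimal brute-force sketch of nearest-neighbour ranking by cosine similarity. It is only a stand-in for what a vector index does at scale, not how Couchbase implements it; the function names and the toy two-dimensional embeddings are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vector, docs, k):
    """Return the k documents whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query_vector, d["embedding"]), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


# Toy corpus with hand-written 2-dimensional "embeddings".
docs = [
    {"text": "How to reset a password", "embedding": [1.0, 0.0]},
    {"text": "Shipping times", "embedding": [0.0, 1.0]},
    {"text": "Account recovery steps", "embedding": [0.9, 0.1]},
]
hits = top_k([1.0, 0.0], docs, k=2)
print([d["text"] for d in hits])
```

A real index avoids scoring every document by using an approximate nearest-neighbour structure, which is why the query takes a candidate count as well as a result limit.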

The 5-step pattern

1. CHUNK    — split source text into passages (e.g. 512 tokens with overlap)
2. EMBED    — generate a vector for each chunk using an embedding model
3. STORE    — upsert each chunk as a Couchbase document with an "embedding" field
4. RETRIEVE — at query time, embed the user's question and run a vector search
5. AUGMENT  — pass the top-K retrieved chunks as context to the LLM prompt
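The CHUNK step can be sketched as a sliding window with overlap. This version counts whitespace-separated words as a stand-in for tokens, which is an assumption made to keep the example dependency-free; a real pipeline would normally count tokens with the embedding model's own tokenizer. The name `chunk_words` and the default sizes are illustrative.

```python
def chunk_words(text, chunk_size=512, overlap=64):
    """Split text into overlapping passages of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
print(chunks)
```

The overlap repeats the tail of one passage at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.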

Store step (language-agnostic pseudocode)

for chunk in chunks:
    vector = embedding_model.encode(chunk.text)   # float[] of length = dims
    doc = {
        "text": chunk.text,
        "source": chunk.source,
        "embedding": vector
    }
    collection.upsert(f"chunk::{chunk.id}", doc)

Retrieve step (language-agnostic pseudocode)

query_vector = embedding_model.encode(user_question)
results = cluster.search(
    index_name,
    VectorSearch(VectorQuery("embedding", query_vector, num_candidates=10)),
    SearchOptions(fields=["text", "source"], limit=5)
)
context = [row.fields["text"] for row in results.rows()]
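Once `context` holds the retrieved passages, the AUGMENT step amounts to assembling a prompt. The sketch below is one plausible way to do that; the helper `build_prompt`, the prompt wording, and the character budget are all assumptions rather than anything this skill prescribes.

```python
def build_prompt(question, context, max_chars=4000):
    """Assemble an LLM prompt from retrieved passages (hypothetical helper)."""
    parts = []
    used = 0
    for i, passage in enumerate(context, start=1):
        entry = f"[{i}] {passage.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the context budget
        parts.append(entry)
        used += len(entry)
    joined = "\n".join(parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}\nAnswer:"
    )


context = [
    "Passwords can be reset from the login page.",
    "Resets expire after 24 hours.",
]
prompt = build_prompt("How do I reset my password?", context)
print(prompt)
```

Numbering the passages lets the model cite which chunk an answer came from, and the budget keeps the highest-ranked chunks when not everything fits.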

Key constraints

dims in the vector index definition must exactly match the embedding model output size (e.g. 1536 for text-embedding-ada-002, 384 for all-MiniLM-L6-v2).
Use templates/vector-index.json as the index definition template — set dims to match your model.
Vectors must be stored as a JSON array of floats, not base64.
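These constraints can be checked before each upsert. The following is a minimal sketch, assuming the expected dimension is known in application code; `validate_embedding` is a hypothetical helper, not an SDK function.

```python
import json
import math


def validate_embedding(vector, dims):
    """Check that vector is a plain list of dims finite numbers."""
    if not isinstance(vector, list):
        raise TypeError("embedding must be a list so it serialises as a JSON array")
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    for value in vector:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("embedding values must be numbers")
        if not math.isfinite(value):
            raise ValueError("embedding values must be finite")
    return [float(v) for v in vector]


doc = {"text": "hello", "embedding": validate_embedding([0.1, 0.2, 0.3], dims=3)}
encoded = json.dumps(doc)  # the embedding is emitted as a JSON array of floats
print(encoded)
```

Many embedding libraries return NumPy arrays rather than lists; converting with `.tolist()` before validation keeps the stored value a plain JSON array.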

Framework integrations

Couchbase has official integrations for both major RAG frameworks:

LangChain: langchain-couchbase package — CouchbaseVectorStore class handles store and retrieve steps. See https://python.langchain.com/docs/integrations/vectorstores/couchbase
LlamaIndex: llama-index-vector-stores-couchbase package — CouchbaseVectorStore. See https://docs.llamaindex.ai/en/stable/examples/vector_stores/CouchbaseVectorStoreDemo/

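Both integrations handle document upsert and similarity search. Their documentation indicates that the Search index itself must already exist (created through the UI or the REST interface), so create it before pointing either integration at it. Prefer these integrations over calling the SDK directly when building production RAG pipelines.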
