Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
main.go	main.go

Long-Term Memory Example (RAG + Qdrant)

An interactive terminal chatbot that demonstrates semantic (RAG) memory for multi-turn conversations.

This example shows:

REPL-style interaction - Chat naturally in a loop
Semantic memory - Retrieves relevant past snippets based on your current message
Vector store - Memory is persisted in Qdrant
Memory commands - Recall snippets, clear memory, show configuration

What you'll learn

How to use memory/rag to recall relevant past messages
How to back semantic memory with vectorstore/qdrant
How to implement a chat loop with persistent state
How to manage semantic memory (recall, clear)
How memory affects agent responses

Run

This example uses an embedding model to store/retrieve memory.

You also need a Qdrant instance reachable via gRPC (default port 6334). For local dev:

docker run --rm -p 6333:6333 -p 6334:6334 qdrant/qdrant

Option A: Local Ollama (OpenAI-compatible)

export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_MODEL=gpt-oss:20b-cloud
# OPENAI_API_KEY can be empty for Ollama

go run ./examples/long-term-memory

Option B: OpenAI

export OPENAI_API_KEY=...your_key...
export OPENAI_MODEL=gpt-4o-mini

go run ./examples/long-term-memory

Flags

go run ./examples/long-term-memory -- \
  -topk 5 \
  -qdrant-host localhost \
  -qdrant-port 6334 \
  -qdrant-collection long_term_memory

Usage

Chat normally

Just type your message and press Enter:

> What's your name?

I'm a conversational assistant. You can call me Assistant. How can I help you today?

> Remember that my favorite color is blue.

Got it! I'll remember that your favorite color is blue.

> What's my favorite color?

Your favorite color is blue!

With RAG memory, the agent recalls relevant past snippets based on what you ask.

Commands

Type commands starting with / to manage memory:

/history <query> - Retrieve relevant past snippets (semantic recall)
/clear - Clear all conversation memory (fresh start)
/stats - Show memory configuration
/help - Show available commands
/exit - Exit the program

Example session

> Tell me a joke about programming

Why do programmers prefer dark mode? Because light attracts bugs! 🐛

> /stats

📊 Memory Statistics:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: semantic (RAG)
TopK per turn: 5
Store: qdrant (localhost:6334 tls=false collection=long_term_memory)
Vector size: 1536

> /history dark mode bugs

📜 Retrieved Memory (topk=5):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  1. ⚙️ system: Context retrieved from memory:
     user: Tell me a joke about programming
     assistant: Why do programmers prefer dark mode? Because light attracts bugs! 🐛

> /clear

✨ Cleared semantic memory.

How it works

Memory setup

This example wires an embedding model + vector store into a rag.RAG, then wraps it as an agent memory.Memory:

qc, _ := qdrantapi.NewClient(&qdrantapi.Config{Host: "localhost", Port: 6334})

// Vector size must match the embedder output size (this example infers it at runtime).
store, _ := vsqdrant.New(qc, "long_term_memory", vsqdrant.WithVectorSize(1536))
ragEngine, _ := rag.New(store, embedder, rag.WithTopK(5))
conversationMemory := ragmemory.New(ragEngine)
a.SetMemory(conversationMemory)

Agent automatically uses memory

When you call a.Run(ctx, userInput):

The agent retrieves relevant memory snippets (semantic search)
Adds the new user message
Calls the LLM with full conversation context
Stores the response back in memory
Returns the response to you

Memory is automatic - you don't need to manually manage retrieval.

A note about "history"

RAG memory is not a chronological transcript. /history <query> performs semantic retrieval, so it returns whatever past snippets are most relevant to the query.

Memory management strategies

Tuning tips

Increase -topk if the agent often misses relevant earlier context.
Keep -topk modest to avoid flooding the prompt with irrelevant memory.

When to clear memory

Starting a new topic
User requests a fresh start
Agent seems confused by old context
Testing/debugging

Persistence

This example persists embeddings and payloads to Qdrant, so memory survives process restarts.

Customization ideas

Extend this example to learn more:

Add save/load commands - Persist conversations to JSON files
Topic detection - Auto-clear memory when topic changes
More tools - Weather, calculator, search, etc.
Colored output - Use ANSI colors for better readability
Streaming responses - Show partial responses as they're generated

Comparison to other examples

Example	Memory	Tools	Pattern
Simple Agent	❌ No	✅ Calculator	Single-turn
Conversational Agent	✅ Yes	✅ Time	Multi-turn REPL
Long-Term Memory	✅ Yes (semantic)	✅ Time	Multi-turn REPL + semantic recall
RAG Chatbot	✅ Yes	✅ RAG search	Multi-turn + retrieval
Multi-Agent	❌ No	✅ Go commands	Sequential agents

This example focuses specifically on memory and conversational state. It's the foundation for building chatbots, assistants, and interactive tools.

Next steps

RAG Chatbot - Add document retrieval to conversations
Supervisor Blackboard - Shared memory between multiple agents
Skills - Add reusable capabilities to conversational agents

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Long-Term Memory Example (RAG + Qdrant)

What you'll learn

Run

Option A: Local Ollama (OpenAI-compatible)

Option B: OpenAI

Flags

Usage

Chat normally

Commands

Example session

How it works

Memory setup

Agent automatically uses memory

A note about "history"

Memory management strategies

Tuning tips

When to clear memory

Persistence

Customization ideas

Comparison to other examples

Next steps

FilesExpand file tree

long-term-memory

Directory actions

More options

Directory actions

More options

Latest commit

History

long-term-memory

Folders and files

parent directory

README.md

Long-Term Memory Example (RAG + Qdrant)

What you'll learn

Run

Option A: Local Ollama (OpenAI-compatible)

Option B: OpenAI

Flags

Usage

Chat normally

Commands

Example session

How it works

Memory setup

Agent automatically uses memory

A note about "history"

Memory management strategies

Tuning tips

When to clear memory

Persistence

Customization ideas

Comparison to other examples

Next steps