This guide will get you up and running with rlm-rs in 5 minutes.
RLM-RS is a CLI tool that helps AI assistants (like Claude Code) process documents that are too large for their context window. It does this by:
- Chunking - Breaking large documents into smaller pieces
- Embedding - Creating semantic representations for search
- Searching - Finding relevant chunks using hybrid semantic + keyword search
- Passing by Reference - Letting AI agents retrieve specific chunks by ID
Think of it as a "smart pagination system" that lets AI assistants navigate huge documents efficiently.
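The pass-by-reference idea can be sketched in plain bash, with no rlm-cli involved (the chunk texts and the keyword here are made up for illustration):

```shell
# Conceptual sketch (plain bash, not rlm-cli): chunk a document, "search" it
# by keyword, and hand around matching chunk IDs instead of the text itself.
chunks=(
  "Intro: what the tool does"
  "Install: cargo install rlm-cli"
  "Usage: search and retrieve"
)
matches=""
for i in "${!chunks[@]}"; do
  case "${chunks[$i]}" in
    *Install*) matches="$matches $i" ;;  # record the ID, not the chunk text
  esac
done
echo "matching chunk IDs:$matches"
for id in $matches; do
  echo "chunk $id: ${chunks[$id]}"  # dereference only when needed
done
```

The point is the shape of the data flow: search returns small IDs, and the full text is fetched only for the chunks that matter.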
- Rust 1.88+ (or use pre-built binaries)
- macOS, Linux, or Windows
Via Cargo:

```bash
cargo install rlm-cli
```

Via Homebrew:

```bash
brew tap zircote/tap
brew install rlm-rs
```

From source:

```bash
git clone https://github.com/zircote/rlm-rs.git
cd rlm-rs
make install
```

Verify the installation:

```bash
rlm-cli --version
# Output: rlm-cli 1.2.4
```

RLM-RS stores state in a local SQLite database:

```bash
rlm-cli init
```

This creates `.rlm/rlm-state.db` in your current directory.
Let's load a markdown file with automatic chunking and embedding:
```bash
rlm-cli load README.md --name readme --chunker semantic
```

This:
- Loads `README.md` into a buffer named "readme"
- Chunks it at natural boundaries (headings, paragraphs)
- Generates semantic embeddings automatically
Check the state of the database at any time:

```bash
rlm-cli status
```

Output:

```
Database: .rlm/rlm-state.db (256 KB)
Buffers: 1
Total chunks: 47
Embedded chunks: 47 (100%)
```
Use hybrid search (semantic + BM25):

```bash
rlm-cli search "installation instructions" --buffer readme --top-k 3
```

Output:

```
Chunk ID: 12 | Score: 0.89 | Buffer: readme
## Installation
### Via Cargo (Recommended)
cargo install rlm-cli
...
```
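One common way to fuse a semantic score with a BM25 score is a weighted sum, which can be sketched with awk. The weight `alpha` and both input scores below are illustrative assumptions, not rlm-cli's actual ranking formula:

```shell
# Illustrative hybrid-score fusion: weighted sum of a (normalized) semantic
# similarity and a (normalized) BM25 score. All values here are made up.
alpha=0.7
semantic=0.9
bm25=0.5
hybrid=$(awk -v a="$alpha" -v s="$semantic" -v b="$bm25" \
  'BEGIN { printf "%.2f", a * s + (1 - a) * b }')
echo "hybrid score: $hybrid"
```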
Once you know a chunk ID, you can retrieve it directly:

```bash
rlm-cli chunk get 12
```

This is the "pass-by-reference" pattern: instead of copying text, you pass chunk IDs.
RLM-RS supports multiple chunking strategies:
| Strategy | Best For | How It Works |
|---|---|---|
| `semantic` | Markdown, documentation | Splits at headings and paragraphs |
| `code` | Source code | Splits at function/class boundaries |
| `fixed` | Plain text, logs | Fixed-size chunks with overlap |
| `parallel` | Large files (>10MB) | Multi-threaded fixed chunking |
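The `fixed` strategy's size/overlap behavior can be sketched in plain bash: each chunk starts `size - overlap` characters after the previous one, so consecutive chunks share an overlapping region. The sizes below are toy values, not the CLI's defaults:

```shell
# Toy illustration of fixed-size chunking with overlap (not rlm-cli's code).
text="abcdefghijklmnopqrstuvwxyz"
size=10
overlap=2
step=$((size - overlap))  # each chunk starts this far after the previous one
i=0
while [ "$i" -lt "${#text}" ]; do
  echo "chunk at $i: ${text:$i:$size}"
  i=$((i + step))
done
```

Note how each chunk's first two characters repeat the previous chunk's last two; the overlap keeps content that straddles a boundary retrievable from either side.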
```bash
rlm-cli load src/main.rs --name maincode --chunker code
```

This splits Rust code at function boundaries, keeping functions intact.
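A rough idea of what a code chunker keys on can be sketched with grep over a throwaway Rust file (a real chunker is far more robust; this only finds top-level `fn` lines):

```shell
# Illustration only: locate candidate split points at function boundaries.
cat > /tmp/example.rs <<'EOF'
fn main() {
    println!("hi");
}

fn helper(x: i32) -> i32 {
    x + 1
}
EOF
grep -n '^fn ' /tmp/example.rs  # line numbers where functions begin
```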
For plain text or logs, use fixed-size chunks with overlap:

```bash
rlm-cli load large-log.txt --chunker fixed --chunk-size 150000 --overlap 1000
```

Example workflow for a codebase:

```bash
# Load multiple files
rlm-cli load src/lib.rs --name lib --chunker code
rlm-cli load src/main.rs --name main --chunker code
rlm-cli load tests/integration.rs --name tests --chunker code

# Search across all buffers
rlm-cli search "error handling" --top-k 10

# View specific chunks
rlm-cli chunk get 42
```

Example workflow for a document:

```bash
# Load with semantic chunking
rlm-cli load whitepaper.md --name paper --chunker semantic

# Use regex search for specific terms
rlm-cli grep paper "performance|benchmark" --max-matches 20

# Peek at specific sections
rlm-cli peek paper --start 0 --end 5000
```

Fan out analysis to subagents:

```bash
# Dispatch chunks to multiple AI agents
rlm-cli dispatch --buffer paper --batch-size 5

# (After subagent analysis)
# Aggregate findings
rlm-cli aggregate
```

RLM-RS is designed to work with the rlm-rs Claude Code plugin.
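The batching behind a dispatch step can be sketched in plain shell. The IDs and batch size below are toy values, and rlm-cli's actual dispatch logic may differ:

```shell
# Toy illustration: group chunk IDs 1..12 into batches of 5 for fan-out.
batch_size=5
batch=""
count=0
n=0
for id in $(seq 1 12); do
  batch="$batch $id"
  count=$((count + 1))
  if [ "$count" -eq "$batch_size" ]; then
    n=$((n + 1))
    echo "batch $n:$batch"   # each batch would go to one subagent
    batch=""
    count=0
  fi
done
if [ -n "$batch" ]; then     # flush the final partial batch
  n=$((n + 1))
  echo "batch $n:$batch"
fi
```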
The RLM architecture:

- Root LLM: Main Claude conversation (Opus/Sonnet)
- Sub-LLM: Analyst agents (Haiku) via `rlm-subcall`
- External Environment: `rlm-cli` CLI + SQLite

See Plugin Integration for details.
- Choose the Right Chunker: semantic for prose, code for source files, fixed for logs
- Use Descriptive Buffer Names: `--name docs` instead of auto-generated names
- Leverage Hybrid Search: combines semantic understanding with keyword precision
- Pass by Reference: share chunk IDs instead of copying large text blocks
- Clean Up: use `rlm-cli delete <buffer>` or `rlm-cli reset` to manage state
Now that you're up and running:
- Explore Commands: See the CLI Reference
- Try Examples: Work through Examples
- Customize Features: Learn about Feature Flags
- Get Help: Check Troubleshooting if you hit issues
- FAQ: Frequently Asked Questions
- Troubleshooting: Common Issues
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Next: Examples | CLI Reference | Troubleshooting