Replies: 5 comments
🎯 Revised proposal: KB Plugin v1 — Keep it dead simple

After the feedback in Discord (and general agreement in the thread): the full RAG spec above is the right end-state, but completely the wrong v1. I got carried away designing the dream instead of the thing that actually ships.

What v1 is

A structured content store that agents can read from. Nothing more.

Data model:
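To make the "dead simple" claim concrete, here is one possible shape for a v1 record (a sketch only; every field name here is an assumption, not a spec):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class KbArticle:
    """One knowledge-base entry in the v1 content store (illustrative only)."""
    title: str
    body: str                      # plain markdown; no embeddings in v1
    category: str = "general"      # e.g. "conventions", "runbooks"
    always_inject: bool = False    # prepend to agent context at task start
    updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Example: an "always inject" article carrying company-wide context
article = KbArticle(
    title="Release process",
    body="Staging first; never deploy on Fridays.",
    category="processes",
    always_inject=True,
)
```

A flat record like this is enough for v1: no chunking, no vectors, just rows an agent can read.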
Agent access:
UI (plugin settings page):
What v1 is NOT
All of that belongs in v2+ once the basic structure proves useful.

Why this still delivers real value

Even without RAG, a KB plugin immediately solves the core problem: agents start tasks with zero company context. With v1, you can write "always inject" articles like:
That's already a massive improvement over the current blank slate, and it's trivially simple to build. Think Notion without the AI, as a plugin — flexible enough for every use case, not just dev-heavy setups.

Prior art worth studying: Hindsight for OpenClaw

@mingfang flagged Hindsight in Discord and it's relevant. How it works:
Their key insight: "memory that works automatically is qualitatively different from memory that depends on model behavior." The critique of OpenClaw's native memory — agent has to decide what to save and when to search — maps exactly to the problem we're trying to solve here. But Hindsight and the KB plugin solve different problems:
They're complementary, not competing. Hindsight = agent's episodic memory. KB plugin = company's institutional memory. That said, the Hindsight architecture (local daemon reusing existing Postgres, auto-inject before each turn, feedback loop prevention) is a solid reference for when we eventually build the v2 RAG layer. Worth studying before reinventing.

Suggested build order
RAG/pgvector + Hindsight-style auto-extraction as v2 once we see what people actually put in there.
Good writeup. The "agent starts blind" problem has two layers: missing retrieval context (what the KB solves) and ambiguous task instructions (what the prompt itself needs to fix). Even with solid RAG, if the task description is flat prose the agent still has to infer role, constraints, and output expectations. Separating those into named blocks lets the KB context slot cleanly into a context block while the task, constraints, and format stay separate and stable across runs. I built flompt (https://flompt.dev) for exactly this: a visual prompt builder that decomposes prompts into semantic XML blocks. Pairs well with a KB layer. Open-source: github.com/Nyrok/flompt
Given how badly vector databases for RAG work in the normal OpenClaw implementation, maybe a knowledge-base-based approach is better? Something that decomposes documents first, then combines both normal search and vector search. There are multiple better memory implementations for OpenClaw agents, and none of them use the pure RAG approach. Vector search would, for instance, get totally confused by a normal large code-formatting document; it would need that decomposed into small files that each get their own vectors. I'm not against an agent having access to a KB, just saying the traditional approach may be as unusable as anything. I also object to the "Always inject" flag per category (auto-prepend to agent context at task start): this would only work with multiple separate KBs, or if a KB can be marked as secondary. I.e., something that must be injected for a programmer / implementation-near agent may not need to be injected at all (or even be accessible) for a market researcher.
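The decomposition idea is simple enough to sketch: split a document at headings so each section gets its own embedding, instead of one muddled vector for the whole file (naive illustrative code, not a real chunker):

```python
def split_by_heading(text: str, max_chars: int = 1500) -> list[str]:
    """Split a markdown document into chunks at headings, so each chunk
    can be embedded separately instead of the whole file at once."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.startswith("#") and current:  # a new section starts
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    # Hard-split any oversized section so no chunk exceeds max_chars
    out: list[str] = []
    for c in chunks:
        while len(c) > max_chars:
            out.append(c[:max_chars])
            c = c[max_chars:]
        out.append(c)
    return out

doc = "# Formatting\nTabs vs spaces rules...\n# Naming\nVariable naming rules..."
```

Real chunkers also handle overlap and token counts; this only shows why a formatting guide becomes several small vectors instead of one.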
Maybe this will be useful? https://www.youtube.com/watch?v=sboNwYmH3AY&t=14s
🧠 Context & origin
Started as a simple UX ask (document uploads in issues), escalated fast in Discord. Given the plugin system is already on the roadmap — and that a KB is literally cited as the canonical plugin example — this felt worth formalizing properly.
This is a full proposal, not just a feature request. The goal is to think through the whole thing end-to-end so whoever picks it up (or we build it together) has a solid foundation.
🔥 The core problem
Right now, every agent in Paperclip starts each task essentially blind. It has access to the codebase and the task description, but nothing else. It doesn't know:
This forces you to either:
A knowledge base solves this at the platform level, once, for all agents.
📦 What the KB stores — full breakdown
1. 📄 Technical documentation
.env structure explanations (not secrets, but what each var is for)
2. 🧑💻 Code conventions & standards
3. 🏗️ Architectural decisions (ADRs)
4. 🧩 Product & domain knowledge
5. 📋 Processes & runbooks
6. 🔐 Constraints & compliance rules
7. 🕰️ Historical context & task memory
8. 🔗 External references
9. 📁 Files & attachments
🏗️ Proposed architecture
🔌 Why pgvector is the right call here
Paperclip already runs an embedded Postgres instance per deployment. Adding pgvector as an extension is a single line — no new infrastructure, no new service to manage, no external SaaS dependency.

Hybrid search query (vector + keyword):
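One way to picture the hybrid step: run the keyword ranking and the vector ranking separately, then merge them with reciprocal rank fusion. This is a rough Python sketch of the merge only; in Postgres it could be a single statement combining a pgvector distance with ts_rank, and the document ids below are made up:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked result lists (e.g. vector search + keyword search)
    into one ordering. Documents ranked highly by either retriever rise to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Standard RRF: contribution decays with rank; k damps the top ranks
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the two retrievers
vector_hits = ["adr-12", "env-vars", "style-guide"]
keyword_hits = ["style-guide", "adr-12", "runbook-3"]
```

"adr-12" appears near the top of both lists, so it wins the fused ranking; documents found by only one retriever still survive, which is the point of going hybrid.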
No Pinecone, no Qdrant, no Weaviate. Just the Postgres you already have.
🤖 How agents interact with the KB
Auto-injection at task start
When a task is created and an agent picks it up, the runtime automatically:
<knowledge_base> block

Mid-task explicit lookup
The agent can call the KB as a tool at any point during execution:
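As a rough sketch of what that tool surface could look like (the JSON-schema shape and the toy handler below are assumptions, not the actual runtime API):

```python
import json

# Hypothetical tool schema the runtime could expose to agents
KB_SEARCH_TOOL = {
    "name": "kb.search",
    "description": "Search the company knowledge base.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Natural-language query"},
            "category": {"type": "string", "description": "Optional category filter"},
            "limit": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}

def handle_kb_search(args: dict, store: list[dict]) -> str:
    """Toy handler: naive substring match over an in-memory store.
    The real plugin would run the hybrid Postgres query instead."""
    query = args["query"].lower()
    hits = [a for a in store if query in a["body"].lower()]
    return json.dumps([{"title": a["title"]} for a in hits[: args.get("limit", 5)]])
```

The schema is the contract; the handler is swappable, which is what lets v1 ship with dumb search and v2 upgrade to hybrid retrieval behind the same tool.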
Writing back to the KB
Agents can propose KB additions — decisions they made that should be remembered:
These go into a pending review queue visible in the UI — a human approves before they're added to the main KB. No agent writes to the KB without human confirmation.
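A toy version of that review-queue flow, just to pin down the invariant that agent writes never reach the KB directly (all names are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class PendingQueue:
    """Agent proposals wait in `pending` until a human approves them."""
    pending: list[dict] = field(default_factory=list)
    approved: list[dict] = field(default_factory=list)

    def propose(self, agent: str, title: str, body: str) -> None:
        # kb.save() from an agent only ever lands in the pending queue
        self.pending.append({"agent": agent, "title": title, "body": body})

    def approve(self, index: int) -> None:
        # A human action in the UI moves the entry into the real KB
        self.approved.append(self.pending.pop(index))
```

Whatever the real storage looks like, the key property is that `approve` is the only path into the main KB.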
🎨 UI considerations
KB management page (/settings/knowledge-base)

In-issue attachment support (the original ask)
❓ Open questions
Embedding model — local (nomic-embed-text, all-MiniLM-L6 via Ollama) for full privacy, or API-based (OpenAI text-embedding-3-small, Cohere) for better quality? Should be configurable.

Context budget — how many tokens of KB context should be auto-injected per task? Should agents be able to request more?
Scoping — company-wide KB only, or also per-project KBs? Per-agent private KB?
Sync strategy — for Git repo sync, should we watch the /docs folder only, or let users configure which paths to watch?

Agent write access — should agents be able to mark a document as "outdated" or "superseded" in addition to proposing new entries?
Version history — should KB entries be versioned so you can see how a document evolved over time?
Access control — should certain KB entries be restricted to specific agents or roles?
🛣️ Suggested build order
Phase 1 — Basic storage + manual upload
Phase 2 — Hybrid search + auto-injection
Phase 3 — Sync sources
/docs folder watcher

Phase 4 — Agent write-back
kb.search() and kb.save() tools exposed to agents

@aaaaron already in 🙋 — happy to co-build. Who else?