GhostCrab MCP + mindBrain SQLite — structured domain navigation for llama_index sessions #21745
FrancoisLamotte
started this conversation in
Show and tell
What problem this solves
LlamaIndex is strong at indexing, retrieval, and workflow composition.
Its memory architecture, however, is siloed by design.
Three isolated layers coexist without ever aligning: short-term
ChatMemoryBuffer (FIFO, evicted under token caps), long-term
MemoryBlock types (FactExtractionMemoryBlock, VectorMemoryBlock, StaticMemoryBlock), and document
StorageContext. Each FunctionAgent and AgentWorkflow instantiates its own
Memory, defaulting to SQLite in-memory. AgentWorkflow shares a runtime Context, but it provides no shared semantic memory across agents.
A ResearchAgent and a WriterAgent in the same workflow operate on disjoint memory islands.
Findings discovered in one pipeline run are invisible to the next.
Context is rebuilt from scratch every time.
This request adds two complementary components that solve this without touching LlamaIndex core.
What a domain looks like in practice
A domain is any bounded context where agents need to
reason over structured relationships, not just retrieve text.
mindBrain separates two levels:
Ontology (the model): the schema of a domain, covering
entity types, relationship types, constraints, and vocabularies.
Defined once, shared across every agent and every workflow run.
Knowledge Graph (the projected instance): the populated graph, with
real entities, real edges, and real state, queryable via projections.
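The split between the model and its populated instance can be sketched in a few lines. This is an illustration only, not mindBrain's actual schema: the `Ontology` and `KnowledgeGraph` classes, the `allows` method, and the sample entity ids are all hypothetical names. Only the relation names `depends_on` and `assigned_to` come from the post.

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    """The model: entity types plus the relations allowed between them."""
    entity_types: frozenset
    relation_types: dict  # relation name -> (source entity type, target entity type)

@dataclass
class KnowledgeGraph:
    """The instance: concrete entities and edges, validated against an ontology."""
    ontology: Ontology
    entities: dict = field(default_factory=dict)  # entity id -> entity type
    edges: list = field(default_factory=list)     # (source id, relation, target id)

    def add_entity(self, entity_id, entity_type):
        if entity_type not in self.ontology.entity_types:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.entities[entity_id] = entity_type

    def allows(self, source, relation, target):
        """True when the ontology permits this relation between these two entities."""
        expected = self.ontology.relation_types.get(relation)
        if expected is None or source not in self.entities or target not in self.entities:
            return False
        return (self.entities[source], self.entities[target]) == expected

    def add_edge(self, source, relation, target):
        if not self.allows(source, relation, target):
            raise ValueError(f"edge rejected by ontology: {source} {relation} {target}")
        self.edges.append((source, relation, target))

# Hypothetical project-delivery ontology, defined once.
delivery = Ontology(
    entity_types=frozenset({"task", "agent"}),
    relation_types={"depends_on": ("task", "task"), "assigned_to": ("task", "agent")},
)

# The populated graph for one workflow run.
kg = KnowledgeGraph(delivery)
kg.add_entity("T1", "task")
kg.add_entity("T2", "task")
kg.add_entity("writer", "agent")
kg.add_edge("T2", "depends_on", "T1")
kg.add_edge("T2", "assigned_to", "writer")
```

The ontology is written once and reused. Every edge an agent asserts is checked against it, so a malformed edge such as a task "assigned to" another task is rejected at write time.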
Example: multi-agent project delivery
The OrchestratorAgent queries a pg_pragma projection:
"What is the phase completion rate for implementation?" → 0.67.
"What tasks are blocked?" → returns the full dependency chain in a single call.
No LLM inference. No re-reading workflow transcripts.
Workers write structured facts; the orchestrator reads pre-computed projections.
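To make "reads pre-computed projections" concrete, here is a minimal sketch using plain SQLite from Python's standard library. It is a stand-in for a pg_pragma projection, not its real implementation. The table layout, the task ids, and the function names `phase_completion_rate` and `blocker_chain` are assumptions made for the example, and the dependency data is assumed to be acyclic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (id TEXT PRIMARY KEY, phase TEXT, status TEXT);
CREATE TABLE depends_on (task_id TEXT, blocker_id TEXT);
INSERT INTO tasks VALUES
  ('T1', 'implementation', 'done'),
  ('T2', 'implementation', 'done'),
  ('T3', 'implementation', 'blocked'),
  ('T4', 'design', 'in_progress'),
  ('T5', 'design', 'todo');
INSERT INTO depends_on VALUES ('T3', 'T4'), ('T4', 'T5');
""")

def phase_completion_rate(conn, phase):
    """Share of tasks in a phase whose status is 'done', rounded to two decimals."""
    row = conn.execute(
        "SELECT ROUND(AVG(status = 'done'), 2) FROM tasks WHERE phase = ?",
        (phase,),
    ).fetchone()
    return row[0]

def blocker_chain(conn, task_id):
    """Walk depends_on edges transitively and return blockers, nearest first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(node, depth) AS (
            SELECT blocker_id, 1 FROM depends_on WHERE task_id = ?
            UNION ALL
            SELECT d.blocker_id, c.depth + 1
            FROM depends_on AS d JOIN chain AS c ON d.task_id = c.node
        )
        SELECT node FROM chain ORDER BY depth
        """,
        (task_id,),
    ).fetchall()
    return [r[0] for r in rows]

rate = phase_completion_rate(conn, "implementation")
chain = blocker_chain(conn, "T3")
```

With two of three implementation tasks done, the rate comes out at 0.67, and the recursive query returns the whole chain behind T3 in one call. Both answers are computed from structured rows, with no model inference involved.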
Example: ERP domain
A FinanceAgent resolving a billing anomaly calls
query_context with facets
{ component: "billing", role: "analyst" } and retrieves account structure, approval rules, and GL entry history across sessions.
No CSV re-injection. No LLM-extracted summaries that cost tokens every run.
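The cross-session part can be sketched with a file-backed SQLite database: one session writes facts, the connection closes, and a later session reads them back by facet. This is an assumption-laden illustration. The `facts` table, the `open_store` and `query_by_facets` helpers, and the sample rows are invented for the example and do not describe GhostCrab's real `query_context` implementation.

```python
import os
import sqlite3
import tempfile

ALLOWED_FACETS = {"component", "role"}  # fixed allow-list of facet columns

def open_store(path):
    """Open (or create) the on-disk store that outlives a single session."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts (component TEXT, role TEXT, content TEXT)"
    )
    return conn

def query_by_facets(conn, facets):
    """Exact-match retrieval: every facet must equal the stored value."""
    unknown = set(facets) - ALLOWED_FACETS
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    where = " AND ".join(f"{name} = ?" for name in facets)
    sql = "SELECT content FROM facts"
    if where:
        sql += f" WHERE {where}"
    sql += " ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, tuple(facets.values()))]

path = os.path.join(tempfile.mkdtemp(), "mindbrain_demo.db")

# Session 1: an agent records facts, then the process ends.
session_one = open_store(path)
session_one.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("billing", "analyst", "sample approval rule"),
        ("billing", "auditor", "sample audit note"),
        ("shipping", "analyst", "sample carrier note"),
    ],
)
session_one.commit()
session_one.close()

# Session 2: a fresh connection sees the same facts without re-ingestion.
session_two = open_store(path)
results = query_by_facets(session_two, {"component": "billing", "role": "analyst"})
session_two.close()
```

Facet names are checked against an allow-list before being placed in the SQL text, and the values are passed as bound parameters.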
Example: CRM domain
A SalesAgent building a renewal brief retrieves account tier,
deal stage, last interaction sentiment, and contractual constraints
via query_context with faceted retrieval, not vector similarity. The distinction matters:
tier=enterprise AND stage=negotiation returns a precise slice. A vector search returns ranked passages.
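The difference between a precise slice and a ranked list shows up even in a toy example. In the sketch below, `facet_slice` applies the exact-match filter, while `overlap_rank` is a deliberately crude stand-in for similarity search: it scores by shared tokens and is not a real vector search. The account names and both function names are hypothetical.

```python
accounts = [
    {"name": "Acme", "tier": "enterprise", "stage": "negotiation"},
    {"name": "Globex", "tier": "enterprise", "stage": "closed_won"},
    {"name": "Initech", "tier": "smb", "stage": "negotiation"},
]

def facet_slice(records, **facets):
    """Keep only records where every requested facet matches exactly."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]

def overlap_rank(records, query):
    """Toy ranking: order all records by token overlap, excluding none."""
    query_tokens = set(query.lower().split())

    def score(record):
        return len(query_tokens & {str(v).lower() for v in record.values()})

    # sorted() is stable, so ties keep their original order.
    return sorted(records, key=score, reverse=True)

precise = facet_slice(accounts, tier="enterprise", stage="negotiation")
ranked = overlap_rank(accounts, "enterprise negotiation")
```

The facet filter returns only the one account that satisfies both conditions. The ranking returns every account, with partial matches still present further down the list, which is the behavior the post contrasts against.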
Example: Customer Support domain
A SupportAgent resolving TK-9821 calls
assert_fact to link the ticket to the known issue, then
query_context to retrieve resolution history
across sessions, without re-ingesting support documents every run.
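The write-then-read flow above might look roughly like this. The two function names match the tools mentioned in the post, but their signatures here are guesses made for illustration, backed by a plain SQLite table rather than the real GhostCrab server. The known-issue id `KI-1`, the `resolved_by` relation, and the resolution text are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (subject TEXT, relation TEXT, object TEXT, "
    "UNIQUE (subject, relation, object))"
)

def assert_fact(conn, subject, relation, obj):
    """Record an edge once; asserting the same fact again is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO facts VALUES (?, ?, ?)", (subject, relation, obj)
    )
    conn.commit()

def query_context(conn, subject, relation):
    """Return the objects linked to a subject by a relation, in insertion order."""
    rows = conn.execute(
        "SELECT object FROM facts WHERE subject = ? AND relation = ? ORDER BY rowid",
        (subject, relation),
    )
    return [row[0] for row in rows]

# Earlier run: the known issue already carries a resolution (sample data).
assert_fact(conn, "KI-1", "resolved_by", "sample fix applied in an earlier run")

# Current run: link the ticket to the known issue, twice to show idempotence.
assert_fact(conn, "TK-9821", "linked_to", "KI-1")
assert_fact(conn, "TK-9821", "linked_to", "KI-1")

# Follow ticket -> known issue -> resolution history.
issues = query_context(conn, "TK-9821", "linked_to")
history = [fix for issue in issues for fix in query_context(conn, issue, "resolved_by")]
```

The uniqueness constraint keeps repeated assertions from duplicating edges, so agents can assert the same link on every run without bloating the store.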
The meta-ontology: where mindBrain becomes strategic
Each domain above is useful alone. The real leverage is connecting them.
A LlamaIndex multi-agent pipeline working on a cross-domain task —
say, an enterprise renewal involving a billing dispute,
an open critical support ticket, and a stalled deal —
can query all four domains from the same shared registry in one pass.
LlamaIndex continues to own document retrieval and RAG.
mindBrain owns operational context: decisions, task state, entity relationships,
phase progression, and cross-domain links.
That is the clean boundary: LlamaIndex retrieves documents.
mindBrain navigates the structured world those documents describe.
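A cross-domain read "in one pass" can be pictured as a single query over a registry where every fact carries its domain. The sketch below assumes a flat `facts` table keyed by account. The table shape, the `account_context` helper, the account names, and the sample values are hypothetical and only loosely mirror the renewal scenario described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE facts (domain TEXT, account TEXT, key TEXT, value TEXT);
INSERT INTO facts VALUES
  ('erp',      'ACME',   'billing_dispute', 'open'),
  ('support',  'ACME',   'critical_ticket', 'TK-9821'),
  ('crm',      'ACME',   'deal_stage',      'stalled'),
  ('delivery', 'ACME',   'phase',           'implementation'),
  ('crm',      'GLOBEX', 'deal_stage',      'negotiation');
""")

def account_context(conn, account):
    """One query, all domains: group an account's facts by the domain they come from."""
    context = {}
    rows = conn.execute(
        "SELECT domain, key, value FROM facts WHERE account = ? ORDER BY domain, key",
        (account,),
    )
    for domain, key, value in rows:
        context.setdefault(domain, {})[key] = value
    return context

acme = account_context(conn, "ACME")
```

An agent preparing the renewal gets billing, support, CRM, and delivery state from one call instead of four separate retrievals.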
Two components, one integration
mindBrain (SQLite, Personal edition)
The data layer. It organizes domain knowledge into three constructs:
Relation types: depends_on, assigned_to, affects, linked_to, validated_by, contradicts, ...
Projections (pg_pragma): pre-computed views (phase completion rates, blocker queues, agent liveness, KG coverage) surfaced at zero inference cost.
Agents don't run vector search as the default path.
They call structured queries: faceted retrieval, graph traversal, projection reads.
Vector search is available as fallback for unstructured content — it remains LlamaIndex's domain.
GhostCrab MCP
The gateway layer. An MCP sidecar that gives LlamaIndex agents the tools to reach mindBrain, including the assert_fact and query_context calls used in the examples above.
GhostCrab is a protocol bridge. mindBrain exists and operates independently.
Why this belongs outside LlamaIndex core
It should stay external.
LlamaIndex already exposes the right seams:
BaseMemory for full replacement, BaseMemoryBlock for additive adoption, and StorageContext with graph_store for ingestion pipelines. mindBrain uses all three without modifying LlamaIndex itself.
Domain ontologies are specific to each application's context.
The framework cannot know in advance what is worth persisting,
how entities relate, or what projections an orchestrator needs.
That belongs to the workflow builder.
If GhostCrab is absent, LlamaIndex behaves exactly as before.
If mindBrain is empty, agents start with a blank namespace and populate it
through normal assert_fact calls during the first workflow run.
Three integration paths
Path 1: Custom BaseMemory (primary, shared registry)
Drop-in replacement. All agents in the workflow share one namespace.
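What "share one namespace" buys can be shown without any framework code. The sketch below is plain Python and does not use or imitate the LlamaIndex `BaseMemory` interface. `SharedRegistry`, `Agent`, and their methods are hypothetical names, chosen only to contrast a per-agent memory island with a registry both agents can reach.

```python
class SharedRegistry:
    """A single store that several agents read and write under a common namespace."""

    def __init__(self):
        self._facts = {}

    def put(self, namespace, key, value):
        self._facts.setdefault(namespace, {})[key] = value

    def get(self, namespace, key, default=None):
        return self._facts.get(namespace, {}).get(key, default)

class Agent:
    def __init__(self, name, registry, namespace):
        self.name = name
        self.registry = registry
        self.namespace = namespace
        self.private_notes = {}  # per-agent memory, invisible to other agents

    def remember(self, key, value):
        """Write to both the private memory and the shared registry."""
        self.private_notes[key] = value
        self.registry.put(self.namespace, key, value)

    def recall(self, key):
        """Read from the shared registry only."""
        return self.registry.get(self.namespace, key)

registry = SharedRegistry()
researcher = Agent("ResearchAgent", registry, namespace="renewal-workflow")
writer = Agent("WriterAgent", registry, namespace="renewal-workflow")

researcher.remember("finding", "sample finding from the research step")
seen_by_writer = writer.recall("finding")
```

The writer never received the finding directly and its own private memory stays empty, yet it can read what the researcher stored because both point at the same namespace.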
Path 2: Custom BaseMemoryBlock (additive, no disruption)
Bolt GhostCrab on as an additional block with priority=0, so it is never evicted.
Existing native memory blocks remain untouched.
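The effect of priority=0 can be illustrated with a simplified eviction model. This is not LlamaIndex's actual truncation algorithm, only a sketch of the rule that a priority-0 block is always kept while blocks with larger priority numbers are dropped first under a token budget. The `Block` class, the `fit_blocks` function, and the token counts are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    priority: int  # 0 = never evicted; larger numbers are dropped first in this sketch
    tokens: int

def fit_blocks(blocks, token_limit):
    """Drop evictable blocks, least important first, until the budget is met."""
    kept = list(blocks)
    evictable = sorted(
        (b for b in kept if b.priority != 0),
        key=lambda b: b.priority,
        reverse=True,
    )
    for block in evictable:
        if sum(b.tokens for b in kept) <= token_limit:
            break
        kept.remove(block)
    # Priority-0 blocks stay even if they alone exceed the limit.
    return kept

blocks = [
    Block("ghostcrab", priority=0, tokens=300),
    Block("facts", priority=1, tokens=200),
    Block("vector", priority=2, tokens=400),
]

roomy = [b.name for b in fit_blocks(blocks, token_limit=500)]
tight = [b.name for b in fit_blocks(blocks, token_limit=100)]
```

With a 500-token budget only the lowest-priority block is dropped. With a 100-token budget every evictable block goes, and the priority-0 block is still returned.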
Path 3: StorageContext graph store (document ingestion path)
For teams using VectorStoreIndex who want ontology relations resolved
alongside document retrieval.
→ Full configuration, skill files, and tested LlamaIndex walkthrough:
https://github.com/mindflight-orchestrator/ghostcrab-personal-mcp/blob/main/ghostcrab-integrations/llamaindex/SKILL_llamaindex_ghostcrab.md
https://github.com/mindflight-orchestrator/ghostcrab-personal-mcp/blob/main/ghostcrab-integrations/llamaindex/SKILL_ghostcrab_runtime.md
What changes in practice
FunctionAgent instances
(observed, owns, depends_on, validated_by)
Scope of this request
llama-index-core, BaseMemory, BaseMemoryBlock, StorageContext
BaseMemory drop-in + optional MemoryBlock + StorageContext graph store