A modular component for Graph-based Retrieval Augmented Generation (GraphRAG) using Neo4j graph database and MinIO object storage.
- Graph-based Retrieval: Leverage graph structures for enhanced RAG with entity and relationship awareness
- Neo4j Integration: Efficient connection pooling with configurable connection management
- MinIO Storage: Store graph embeddings, community reports, and intermediate results in object storage
- Complete CRUD Operations: Create, Read, Update, Delete for nodes and relationships
- Advanced Search Methods:
  - Community/Global Search
  - Integrated Hybrid Search (Global + Parallel Retrieval + RRF + Expansion + Reranking)
- Clustering Algorithms: Louvain, Leiden, Hierarchical clustering for community detection
- RESTful API: FastAPI endpoints with OpenAPI/Swagger documentation
- Modular Architecture: Can be enabled/disabled via configuration
- Type-safe: Pydantic models for automatic validation
- Error Handling: Comprehensive error handling with appropriate HTTP status codes
The Knowledge Graph module is part of Tiledesk LLM's modular architecture. To enable it:
Edit service_conf.yaml:
services:
  graphrag: true  # Enable Knowledge Graph module

# Required dependencies configuration
minio:
  endpoint: "localhost:9000"
  access_key: "minioadmin"
  secret_key: "minioadmin"
  secure: false
neo4j:
  uri: "neo4j://localhost:7687"
  user: "neo4j"
  password: "password"
  database: "neo4j"

# Install with Poetry extras
poetry install --extras "graph"
# Or install all modules
poetry install --extras "all"

Use the GraphRAG Docker profile:
docker-compose --profile app-graph up --build

- Neo4j: Graph database (version 5.x)
- MinIO: Object storage for embeddings and reports
- Redis: For caching and job queues (shared with main application)

- neo4j: Neo4j Python driver
- minio: MinIO Python SDK
- langchain-aws: AWS integrations (for MinIO)
- igraph: Graph analysis library
- pandas: Data processing for community reports

| Variable | Description | Default |
|---|---|---|
| NEO4J_URI | Neo4j connection URI | neo4j://localhost:7687 |
| NEO4J_USER | Neo4j username | neo4j |
| NEO4J_PASSWORD | Neo4j password | password |
| NEO4J_DATABASE | Neo4j database name | neo4j |
| MINIO_ENDPOINT | MinIO endpoint | localhost:9000 |
| MINIO_ACCESS_KEY | MinIO access key | minioadmin |
| MINIO_SECRET_KEY | MinIO secret key | minioadmin |
| MINIO_SECURE | Use HTTPS | false |

Configuration is centralized in service_conf.yaml. See service_conf.yaml.template for complete options.
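
As a rough illustration of how the environment variables in the table could be consumed, the sketch below reads them with the documented defaults. The `GraphSettings` class and `load_settings` function are hypothetical names chosen for this example; the module's actual configuration loader may work differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GraphSettings:
    """Hypothetical container for the Neo4j and MinIO settings."""
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str
    neo4j_database: str
    minio_endpoint: str
    minio_access_key: str
    minio_secret_key: str
    minio_secure: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> GraphSettings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Treat common truthy spellings as True; anything else is False.
    secure = env.get("MINIO_SECURE", "false").strip().lower() in {"1", "true", "yes"}
    return GraphSettings(
        neo4j_uri=env.get("NEO4J_URI", "neo4j://localhost:7687"),
        neo4j_user=env.get("NEO4J_USER", "neo4j"),
        neo4j_password=env.get("NEO4J_PASSWORD", "password"),
        neo4j_database=env.get("NEO4J_DATABASE", "neo4j"),
        minio_endpoint=env.get("MINIO_ENDPOINT", "localhost:9000"),
        minio_access_key=env.get("MINIO_ACCESS_KEY", "minioadmin"),
        minio_secret_key=env.get("MINIO_SECRET_KEY", "minioadmin"),
        minio_secure=secure,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests.
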
- GET /api/kg/health - Check Neo4j connection health
- GET /api/kg/stats - Get database statistics (node count, relationship count, etc.)

- POST /api/kg/nodes - Create a new node
- GET /api/kg/nodes/{node_id} - Read node by ID
- GET /api/kg/nodes?label=... - List nodes by label
- GET /api/kg/nodes/search?label=...&property_key=...&property_value=... - Search nodes by property
- PUT /api/kg/nodes/{node_id} - Update node
- PATCH /api/kg/nodes/{node_id} - Partially update node
- DELETE /api/kg/nodes/{node_id} - Delete node

- POST /api/kg/relationships - Create relationship between nodes
- GET /api/kg/relationships/{relationship_id} - Read relationship by ID
- GET /api/kg/nodes/{node_id}/relationships?direction=... - List relationships for a node
- PUT /api/kg/relationships/{relationship_id} - Update relationship
- PATCH /api/kg/relationships/{relationship_id} - Partially update relationship
- DELETE /api/kg/relationships/{relationship_id} - Delete relationship

- POST /api/kg/create - Create/import knowledge graph from vector store namespace
- POST /api/kg/add-document - Add a single document to existing knowledge graph and update community reports
- POST /api/kg/louvein-cluster - Perform Louvain clustering with MinIO storage
- POST /api/kg/leiden-cluster - Perform Leiden clustering
- POST /api/kg/hierarchical - Perform Hierarchical Clustering

- POST /api/kg/hybrid - Primary endpoint: Integrated hybrid search (Global + Parallel Retrieval + RRF + Expansion + Reranking)
- POST /api/kg/qa - Community/Global search on community reports

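
The hybrid endpoint lists RRF (Reciprocal Rank Fusion) as one of its stages. The sketch below shows the general technique for merging several ranked result lists; it is not taken from the module's source. The function name, the use of string IDs, and the constant `k = 60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top item in a list contributes 1 / (k + 1).
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Two retrievers that disagree on ordering (IDs are made up for the example).
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "c", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Items that rank well in several lists rise to the top even when no single retriever ranks them first, which is why RRF is a common choice before a reranking step.
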
curl http://localhost:8000/api/kg/health

curl -X POST http://localhost:8000/api/kg/nodes \
-H "Content-Type: application/json" \
-d '{
"label": "Document",
"properties": {
"title": "Introduction to RAG",
"content": "RAG stands for Retrieval Augmented Generation...",
"embedding": [0.1, 0.2, 0.3]
}
}'

curl -X POST http://localhost:8000/api/kg/relationships \
-H "Content-Type: application/json" \
-d '{
"source_id": "123",
"target_id": "456",
"type": "REFERENCES",
"properties": {
"weight": 0.8,
"context": "citation"
}
}'

curl "http://localhost:8000/api/kg/nodes?label=Document&limit=10"

curl -X POST http://localhost:8000/api/kg/create \
-H "Content-Type: application/json" \
-d '{
"namespace": "my-documents",
"engine": {
"name": "pinecone",
"type": "serverless",
"apikey": "your-api-key",
"vector_size": 1536,
"index_name": "tilellm"
}
}'

curl -X POST http://localhost:8000/api/kg/add-document \
-H "Content-Type: application/json" \
-d '{
"metadata_id": "doc_12345_uuid",
"namespace": "my-documents",
"engine": {
"name": "pinecone",
"type": "serverless",
"apikey": "your-api-key",
"index_name": "tilellm"
},
"deduplicate_entities": true,
"sparse_encoder": "splade",
"llm_key": "my-llm-key",
"model": "gpt-4"
}'

curl -X POST http://localhost:8000/api/kg/hybrid \
-H "Content-Type: application/json" \
-d '{
"question": "What is the relationship between AI and machine learning?",
"namespace": "my-documents",
"engine": {
"name": "pinecone",
"type": "serverless",
"apikey": "your-api-key",
"vector_size": 1536,
"index_name": "tilellm"
}
}'

{
"id": "string", # Auto-generated by Neo4j
"label": "string", # Node type (e.g., Document, Person)
"properties": { # Custom properties
"key": "value"
}
}

{
"id": "string", # Auto-generated by Neo4j
"source_id": "string", # Source node ID
"target_id": "string", # Target node ID
"type": "string", # Relationship type (e.g., REFERENCES)
"properties": { # Custom properties
"key": "value"
}
}

- Node Labels: Use PascalCase (e.g., Document, Person, Organization)
- Relationship Types: Use UPPER_SNAKE_CASE (e.g., RELATES_TO, REFERENCES, CITES)
- Properties: Use snake_case (e.g., created_at, document_id, embedding_vector)

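
The naming conventions above can be checked mechanically before data is sent to the API. The following is a minimal sketch using regular expressions. The helper names are hypothetical and the patterns are one reasonable reading of the conventions, not validation rules enforced by the module.

```python
import re

# Starts with an uppercase letter, letters and digits only (no underscores).
_PASCAL_CASE = re.compile(r"[A-Z][A-Za-z0-9]*")
# Uppercase words joined by single underscores.
_UPPER_SNAKE_CASE = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*")
# Lowercase words joined by single underscores.
_SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_node_label(name: str) -> bool:
    """True if the name looks like a PascalCase node label, e.g. Document."""
    return _PASCAL_CASE.fullmatch(name) is not None


def is_relationship_type(name: str) -> bool:
    """True if the name looks like an UPPER_SNAKE_CASE type, e.g. RELATES_TO."""
    return _UPPER_SNAKE_CASE.fullmatch(name) is not None


def is_property_key(name: str) -> bool:
    """True if the name looks like a snake_case property key, e.g. created_at."""
    return _SNAKE_CASE.fullmatch(name) is not None
```

Note that the PascalCase pattern is deliberately loose: it also accepts all-uppercase labels, since the convention as stated does not rule them out.
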
The module uses a connection pool with the following settings:
- Max pool size: 50 connections (configurable)
- Acquisition timeout: 60 seconds
- Connection lifetime: 1 hour
The pool is initialized once and reused for all requests, so individual requests do not pay the cost of opening a new connection.
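
The pool settings listed above map onto configuration options of the official Neo4j Python driver (`max_connection_pool_size`, `connection_acquisition_timeout`, `max_connection_lifetime`). The sketch below shows one way to create a single shared driver with those values. The helper names `pool_options` and `get_driver` are hypothetical, and the module's own initialization code may be organized differently.

```python
from typing import Any, Dict, Optional

_driver: Optional[Any] = None  # module-level singleton, created on first use


def pool_options(max_pool_size: int = 50) -> Dict[str, Any]:
    """Driver keyword arguments matching the documented pool settings."""
    return {
        "max_connection_pool_size": max_pool_size,  # 50 connections by default
        "connection_acquisition_timeout": 60,       # seconds
        "max_connection_lifetime": 3600,            # seconds (1 hour)
    }


def get_driver(uri: str, user: str, password: str) -> Any:
    """Create the driver once and hand back the same instance afterwards."""
    global _driver
    if _driver is None:
        # Imported lazily so this file can be loaded without the neo4j package.
        from neo4j import GraphDatabase

        _driver = GraphDatabase.driver(uri, auth=(user, password), **pool_options())
    return _driver
```

Keeping the driver in one place means every request handler shares the same pool instead of creating its own.
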
GraphRAG uses MinIO for storing:
- Community reports (Parquet format): community-reports/
- Graph embeddings: embeddings/
- Intermediate processing results: intermediate/

Bucket naming follows the pattern: graphrag-{namespace}.
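
To make the storage layout concrete, the sketch below derives a bucket name and object key from the documented pattern and prefixes. The helper names and the file name `level_0.parquet` are made up for illustration. The module may also normalize or validate namespaces in ways not shown here.

```python
# Object prefixes as documented above; the dictionary keys are illustrative.
PREFIXES = {
    "community_reports": "community-reports/",
    "embeddings": "embeddings/",
    "intermediate": "intermediate/",
}


def bucket_for(namespace: str) -> str:
    """Apply the documented bucket pattern graphrag-{namespace}."""
    return f"graphrag-{namespace}"


def object_key(kind: str, filename: str) -> str:
    """Place a file under the prefix for its kind; raises KeyError for unknown kinds."""
    return PREFIXES[kind] + filename


bucket = bucket_for("my-documents")
key = object_key("community_reports", "level_0.parquet")  # hypothetical file name
```
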
The module handles the following errors:
- 400 Bad Request: Validation failed, invalid data
- 404 Not Found: Resource not found
- 500 Internal Server Error: Database or internal errors
- 503 Service Unavailable: Neo4j or MinIO unavailable
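
One way to picture the mapping above is a small function that translates exception categories into status codes. This is a sketch only: the exception classes `ResourceNotFound` and `BackendUnavailable` are hypothetical stand-ins, and the module's real handlers may use different exception types.

```python
from http import HTTPStatus


class ResourceNotFound(Exception):
    """Hypothetical: the requested node or relationship does not exist."""


class BackendUnavailable(Exception):
    """Hypothetical: Neo4j or MinIO could not be reached."""


def status_for(exc: Exception) -> int:
    """Map an exception to the HTTP status code listed in the table of errors."""
    if isinstance(exc, ValueError):
        return int(HTTPStatus.BAD_REQUEST)            # 400: validation failed
    if isinstance(exc, ResourceNotFound):
        return int(HTTPStatus.NOT_FOUND)              # 404: resource not found
    if isinstance(exc, (BackendUnavailable, ConnectionError)):
        return int(HTTPStatus.SERVICE_UNAVAILABLE)    # 503: dependency down
    return int(HTTPStatus.INTERNAL_SERVER_ERROR)      # 500: anything else
```

Distinguishing 503 from 500 lets clients retry when a dependency is temporarily down, rather than treating every failure as a bug.
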
Access interactive API documentation:
http://localhost:8000/docs
All endpoints are documented with interactive examples.
{
"label": "Document",
"properties": {
"content": "...",
"embedding": [...], # Vector embedding
"metadata": {...}
}
}

{
"source_id": "doc1",
"target_id": "doc2",
"type": "SIMILAR_TO",
"properties": {
"similarity_score": 0.85
}
}

# Document -> Sections -> Paragraphs
doc = create_node(label="Document", properties={...})
section = create_node(label="Section", properties={...})
create_relationship(doc.id, section.id, "CONTAINS")

Use clustering endpoints (/api/kg/hierarchical) to automatically detect and organize related content into communities.
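
The Document -> Section pattern can also be expressed as request bodies for POST /api/kg/nodes and POST /api/kg/relationships, using the field names from the curl examples earlier in this document. The sketch below only builds the JSON payloads and sends nothing. The helper names, the property values, and the IDs "123" and "456" are placeholders, since real IDs are generated by Neo4j.

```python
import json
from typing import Any, Dict, Optional


def node_payload(label: str, properties: Dict[str, Any]) -> Dict[str, Any]:
    """Body for POST /api/kg/nodes."""
    return {"label": label, "properties": properties}


def relationship_payload(
    source_id: str,
    target_id: str,
    rel_type: str,
    properties: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Body for POST /api/kg/relationships."""
    return {
        "source_id": source_id,
        "target_id": target_id,
        "type": rel_type,
        "properties": properties or {},
    }


doc = node_payload("Document", {"title": "Introduction to RAG"})
section = node_payload("Section", {"title": "Overview"})
# In practice the IDs come from the responses to the two node-creation calls.
contains = relationship_payload("123", "456", "CONTAINS")
print(json.dumps(contains, indent=2))
```
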
The Knowledge Graph module integrates seamlessly with Tiledesk LLM:
- Authentication: Uses the same JWT token system
- Vector Stores: Compatible with Pinecone and Qdrant
- Configuration: Centralized via service_conf.yaml
- Docker: Available via app-graph profile

Detailed technical documentation on how the Knowledge Graph module works covers:
- Creation process (/api/kg/create)
- Global Search (/api/kg/qa)
- Integrated Hybrid Search (/api/kg/hybrid)
- Role of LLMs, embeddings, reranking, and adaptive graph expansion

See the following reports:
- REPORT_it.md (Italian)
- REPORT.md (English)
For issues and questions, refer to the main project repository: https://github.com/Tiledesk/tiledesk-llm
Module Status: Active
Last Updated: December 2025