1 | | -# Neo4j + Databricks Integration Demo |
| 1 | +# Neo4j + Databricks Integration Lab |
2 | 2 |
3 | | -A comprehensive demonstration of integrating Neo4j graph data with Databricks lakehouse architecture for retail investment analysis. |
| 3 | +A hands-on lab for building graph-augmented AI systems using Neo4j and Databricks. This project demonstrates how to combine Neo4j's graph database capabilities with Databricks AI/BI agents to create a multi-agent architecture that bridges structured graph data and unstructured documents. |
4 | 4 |
5 | 5 | ## Overview |
6 | 6 |
7 | | -This project demonstrates bidirectional data flow between Neo4j and Databricks: |
| 7 | +This lab walks through building a graph augmentation pipeline that leverages: |
8 | 8 |
9 | | -1. **Upload to Databricks** - Load source CSV files to Unity Catalog volumes |
10 | | -2. **Import to Neo4j** - Build a graph database from the uploaded data |
11 | | -3. **Export to Lakehouse** - Extract graph data back to Delta Lake tables |
| 9 | +- **Neo4j** for storing and querying connected data as a property graph |
| 10 | +- **Databricks Unity Catalog** for governed data storage (Delta Lake tables and document volumes) |
| 11 | +- **Neo4j Spark Connector** for bidirectional data transfer between the lakehouse and graph database |
| 12 | +- **Databricks Genie Agent** for natural language queries against structured Delta Lake tables |
| 13 | +- **Databricks Knowledge Agent** for RAG-based retrieval over unstructured documents |
| 14 | +- **Multi-Agent Supervisor** for coordinating structured and unstructured data analysis |
| 15 | +- **DSPy Framework** for structured reasoning and graph schema augmentation suggestions |
| 16 | + |
| 17 | +The architecture enables a continuous enrichment loop: graph data exports to the lakehouse for agent analysis, agents identify gaps between structured records and document content, and validated enrichments write back to Neo4j as new relationships and properties. |
| 18 | + |
| 19 | +``` |
| 20 | +┌─────────────────┐ ┌─────────────────────────────────────────────────┐ |
| 21 | +│ │ │ DATABRICKS LAKEHOUSE │ |
| 22 | +│ Neo4j Graph │────▶│ Delta Tables ◀──▶ Genie Agent │ |
| 23 | +│ │ │ UC Volumes ◀──▶ Knowledge Agent │ |
| 24 | +│ 7 node types │ │ │ │ |
| 25 | +│ 7 rel types │◀────│ Multi-Agent Supervisor │ |
| 26 | +│ │ │ │ │ |
| 27 | +│ │ │ DSPy Augmentation Agent │ |
| 28 | +└─────────────────┘ └─────────────────────────────────────────────────┘ |
| 29 | +``` |
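The enrichment loop described in the added overview (export, gap detection, validated write-back) can be illustrated with a small in-memory sketch. This is not the lab's implementation: the `Relationship` shape, the `find_gaps` and `write_back` helpers, and the relationship type names are all hypothetical, and plain Python lists stand in for the Neo4j export and the agent output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A directed edge in the property graph (hypothetical shape)."""
    source: str
    rel_type: str
    target: str


def find_gaps(graph_edges, document_edges):
    """Return edges found in documents that are absent from the graph export."""
    known = set(graph_edges)
    return [edge for edge in document_edges if edge not in known]


def write_back(graph_edges, candidates, is_valid):
    """Return a new edge list extended with the candidates that pass validation."""
    accepted = [edge for edge in candidates if is_valid(edge)]
    return list(graph_edges) + accepted


# Step 1: edges exported from the graph to the lakehouse (illustrative data).
exported = [Relationship("customer-1", "OWNS", "account-1")]

# Step 2: edges an agent extracted from documents (illustrative data).
from_documents = [
    Relationship("customer-1", "OWNS", "account-1"),
    Relationship("customer-1", "INTERESTED_IN", "stock-1"),
    Relationship("customer-1", "", "stock-2"),  # malformed: no relationship type
]
gaps = find_gaps(exported, from_documents)

# Step 3: only validated enrichments are written back to the graph.
enriched = write_back(exported, gaps, is_valid=lambda edge: bool(edge.rel_type))
```

Keeping validation as a separate, injectable predicate mirrors the overview's point that only validated enrichments reach Neo4j; in a real pipeline that check would likely involve schema rules or human review rather than a one-line lambda.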
12 | 30 |
13 | 31 | ### Data Model |
14 | 32 |
15 | | -The graph models a retail investment platform where **customers** own **accounts** at various **banks**. Accounts can hold investment **positions** in **stocks** issued by **companies**, and accounts perform financial **transactions** that transfer money to other accounts. |
| 33 | +The sample graph models a retail investment domain with **customers**, **accounts**, **banks**, **transactions**, **positions**, **stocks**, and **companies**. |
16 | 34 |
17 | 35 | ``` |
18 | 36 | Customer ──owns──> Account ──held at──> Bank |