neo4j-partners/graph-enriched-lakehouse

Graph-Enriched Lakehouse

Graph enrichment connects Neo4j Graph Data Science to a Databricks Lakehouse as a silver-to-gold pipeline stage. The pipeline reads Silver tables from Unity Catalog, loads the records into Neo4j as a property graph, runs graph algorithms against the network, and writes the results back to the Gold layer as plain Delta columns. Genie, SQL warehouses, dashboards, and downstream ML read those columns without modification. The analytics stack stays unchanged. The catalog gains dimensions it could not carry before.
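The shape of that round trip can be sketched without Spark or Neo4j. The block below is a toy stand-in, not the repository's pipeline: the row layouts, the `enrich` helper, and the `component_id` and `degree` columns are all hypothetical, and a plain connected-components pass takes the place of the GDS algorithms. It only illustrates the idea that flat rows go in, a graph is built from them, and the graph-derived values come back as ordinary columns.

```python
from collections import defaultdict

# Hypothetical Silver rows: one account per row, one transfer per row.
silver_accounts = [{"account_id": a} for a in ["A1", "A2", "A3", "B1", "B2", "C1"]]
silver_transfers = [
    {"src": "A1", "dst": "A2"},
    {"src": "A2", "dst": "A3"},
    {"src": "A3", "dst": "A1"},
    {"src": "B1", "dst": "B2"},
]


def connected_components(nodes, edges):
    """Label each node with the smallest node id in its undirected component."""
    adjacency = defaultdict(set)
    for edge in edges:
        adjacency[edge["src"]].add(edge["dst"])
        adjacency[edge["dst"]].add(edge["src"])
    label = {}
    # Visiting in sorted order means the first unlabeled node of a component
    # is also its smallest id, so it can serve as the component label.
    for start in sorted(nodes):
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in label:
                    label[neighbor] = start
                    stack.append(neighbor)
    return label


def enrich(accounts, transfers):
    """Return the account rows with graph-derived values added as plain columns."""
    ids = [row["account_id"] for row in accounts]
    component = connected_components(ids, transfers)
    degree = defaultdict(int)
    for transfer in transfers:
        degree[transfer["src"]] += 1
        degree[transfer["dst"]] += 1
    return [
        {**row,
         "component_id": component[row["account_id"]],
         "degree": degree[row["account_id"]]}
        for row in accounts
    ]


gold_accounts = enrich(silver_accounts, silver_transfers)
```

Each row in `gold_accounts` keeps its original fields and gains two columns that no single Silver row could have produced on its own, which is the property the Gold layer relies on.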


Projects

A fraud-surfacing demo for Databricks account teams and partners. Financial crime is a network problem: fraud rings operate as connected patterns across many accounts and transactions, and the individual event looks clean while the connected pattern does not. A high-level synthetic fraud dataset loads into Neo4j Aura as a property graph. PageRank, Louvain community detection, and Node Similarity run against the projection and write risk_score, community_id, and similarity_score back to the Gold layer as plain Delta columns: centrality, community membership, and structural similarity materialized where every Databricks tool can reach them.
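As a rough sketch of what the algorithm stage might look like, the block below lists one Cypher statement per step, using write-mode procedures from the Neo4j Graph Data Science library. The graph name `fraud_graph`, the `Account` label, and the `TRANSFERRED_TO` and `SIMILAR_TO` relationship types are assumptions made for illustration; the repository's actual projection and configuration may differ. Note that Node Similarity in write mode stores its score on relationships between node pairs, so a further aggregation step (not shown) would be needed to turn it into a per-row column.

```python
# All names below are hypothetical placeholders, not taken from the repository.
GRAPH_NAME = "fraud_graph"

GDS_STATEMENTS = [
    # Project accounts and transfers into the GDS in-memory graph catalog.
    f"CALL gds.graph.project('{GRAPH_NAME}', 'Account', 'TRANSFERRED_TO')",
    # Centrality: write the PageRank score to a node property.
    f"CALL gds.pageRank.write('{GRAPH_NAME}', {{writeProperty: 'risk_score'}})",
    # Community membership: write the Louvain community id to a node property.
    f"CALL gds.louvain.write('{GRAPH_NAME}', {{writeProperty: 'community_id'}})",
    # Structural similarity: write scored relationships between similar nodes.
    f"CALL gds.nodeSimilarity.write('{GRAPH_NAME}', "
    f"{{writeRelationshipType: 'SIMILAR_TO', writeProperty: 'similarity_score'}})",
]


def run_enrichment(session):
    """Run each statement in order on a session-like object.

    `session` is expected to behave like a Neo4j Python driver session:
    `run(query)` returns a result that exposes `consume()`.
    """
    for statement in GDS_STATEMENTS:
        session.run(statement).consume()
    return len(GDS_STATEMENTS)
```

With the official Python driver, `session` would come from `GraphDatabase.driver(uri, auth=(user, password)).session()`. After the writes complete, the node properties can be read back into Delta tables.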

The demo runs in two phases. The BEFORE space queries unenriched Silver tables: Genie handles standard BI questions cleanly, then falls short on structural-discovery questions because network topology does not exist in flat rows. The AFTER space queries the enriched Gold tables and supports portfolio composition by risk tier, cohort comparisons across community membership, operational workload estimates, and merchant-side analysis conditioned on structural membership. These are questions that require no graph knowledge to read, asked over a catalog that did not carry those dimensions before the pipeline ran.
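To make one of the AFTER-space questions concrete, here is a minimal sketch of portfolio composition by risk tier over a handful of Gold rows. The tier names, the 0.8 and 0.4 cutoffs, the sample rows, and the assumption that `risk_score` has been scaled into the range 0 to 1 are all hypothetical; the demo's own thresholds are not stated here.

```python
from collections import Counter

# Hypothetical Gold rows carrying the enriched columns.
gold_rows = [
    {"account_id": "A1", "risk_score": 0.90, "community_id": 7},
    {"account_id": "A2", "risk_score": 0.85, "community_id": 7},
    {"account_id": "B1", "risk_score": 0.50, "community_id": 3},
    {"account_id": "C1", "risk_score": 0.10, "community_id": 9},
]


def risk_tier(score, high=0.8, medium=0.4):
    """Bucket a score into a tier; the cutoffs are illustrative assumptions."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def composition_by_tier(rows):
    """Return the share of rows that falls into each risk tier."""
    counts = Counter(risk_tier(row["risk_score"]) for row in rows)
    total = sum(counts.values())
    return {tier: counts[tier] / total for tier in ("high", "medium", "low")}


composition = composition_by_tier(gold_rows)
```

Nothing in the aggregation touches the graph. It reads `risk_score` like any other numeric column, which is why the question needs no graph knowledge once enrichment has run.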

Start here:

About

Graph-Enriched Lakehouse: Databricks + Neo4j Aura — fraud detection with graph features from GDS (PageRank, Louvain, Node Similarity) via the Neo4j Spark Connector
