Role: The "Brain" - Storage Engine, Transaction Manager, and Source of Truth.
The Graph Core is the central nervous system of ModelKG. It is the only service permitted to write directly to the Neo4j database.
In many graph projects, developers connect their various applications directly to the Graph DB using a generic driver. While fast initially, this leads to chaos at scale:
- Inconsistent Logic: One app deletes a node but forgets to delete the connecting edges, leaving "Dangling Pointers."
- No Audit Trail: Changes happen without consistent attribution or history.
- Fragility: Complicated Cypher queries are scattered across 10 microservices, making schema refactoring a nightmare.
- Security Gaps: It is hard to enforce row-level (or node-level) security when every app has root DB access.
ModelKG employs the Managed Graph Pattern. The Graph Core acts as a strictly controlled API facade over the database, ensuring ACID compliance, History Tracking, and Event Emission for every single operation.
CHAOTIC ACCESS (Bad)                 MANAGED ACCESS (ModelKG)

[App A] --(Cypher)--> [Neo4j]        [App A] \
                                              +--> [ GRAPH CORE ]
[App B] --(Cypher)--> [Neo4j]        [App B] /     Logic & Safety
                                                         |
[Script]--(Cypher)--> [Neo4j]                            v
                                                     [ NEO4J ]
Graph Core does not just pass queries through. It implements an internal Cypher Abstraction Layer (CAL). This Python-based query builder ensures that all generated queries follow best practices.
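CAL forbids string concatenation for query building. All values are passed as parameters ($param).
- Security: Values are never spliced into the query text, so they cannot change its structure. This closes off Cypher injection, the graph equivalent of SQL injection.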
- Performance: Neo4j can cache the query plan because the query structure remains static even if values change.
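A minimal sketch of what a parameterizing builder could look like. CAL's real interface is not shown in this document, so build_match and its identifier rule are hypothetical. Labels and property keys cannot be sent as Cypher parameters, which is why the sketch validates them against a strict pattern and sends only values as parameters.

```python
import re
from typing import Any

# Assumed rule for this sketch: labels and property keys must be plain identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_match(label: str, filters: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (query_text, parameters) for a label-plus-equality lookup."""
    for name in (label, *filters):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"illegal identifier: {name!r}")
    # Sort keys so the same filter set always yields the same query text,
    # which is what lets the database reuse a cached plan.
    keys = sorted(filters)
    where = " AND ".join(f"n.{key} = ${key}" for key in keys)
    query = f"MATCH (n:{label})"
    if where:
        query += f" WHERE {where}"
    query += " RETURN n"
    return query, dict(filters)


# A hostile value stays inside the parameter map and never reaches the query text.
query, params = build_match("Server", {"status": "ONLINE' OR 1=1 //"})
print(query)   # MATCH (n:Server) WHERE n.status = $status RETURN n
print(params)
```

The returned pair is the shape that a driver call taking a query string plus a parameter map would expect.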
Every API call is wrapped in a discrete transaction.
- Atomic Updates: If you send a batch request to create 50 nodes and the 50th one fails validation, all 50 are rolled back.
- Isolating Noise: Dirty reads are prevented by enforcing strict isolation levels.
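The all-or-nothing behaviour can be illustrated with an in-memory stand-in for the database. This is only a sketch: the store, the validate rule, and create_batch are hypothetical, and a real implementation would rely on a Neo4j transaction rather than a copied dict.

```python
import copy


class ValidationError(Exception):
    pass


def validate(node: dict) -> None:
    # Hypothetical rule for the sketch: every node needs a non-empty "name".
    if not node.get("name"):
        raise ValidationError(f"node is missing a name: {node!r}")


def create_batch(store: dict, nodes: list) -> None:
    """Apply every node or none of them."""
    staged = copy.deepcopy(store)      # work on a private copy
    for node in nodes:
        validate(node)                 # raising here discards `staged`
        staged[node["id"]] = node
    store.clear()
    store.update(staged)               # "commit": publish all changes together


store = {}
batch = [{"id": i, "name": f"node-{i}"} for i in range(49)]
batch.append({"id": 49, "name": ""})   # the 50th node is invalid

try:
    create_batch(store, batch)
except ValidationError:
    pass
print(len(store))  # 0
```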
ModelKG structures the graph using specific patterns to allow for scale, reusability, and multi-tenancy.
A fundamental challenge in graph modeling is Data Redundancy. If you manage 50,000 laptops in an enterprise, 95% of their data is identical (Manufacturer: Apple, Model: MacBook Pro M3, Ports: 3x Thunderbolt). Storing these strings 50,000 times is wasteful and makes updating specifications slow and error-prone, because every copy has to be changed in step.
The Solution: Separation of Definition (Master) and Implementation (Instance).
        [ Master Node: "MacBook Pro M3" ]
        | label: ProductModel
        | cpu: "M3 Max"
        | ports: ["USBC", "HDMI"]
        | maintenance_guide: "url..."
        |
        ^ (Relationship: INSTANCE_OF)
        |
+-------+--------------------+---------------------+
|                            |                     |
[ Instance: UserA_Laptop ]   [ Instance: UserB_Laptop ]
| label: Asset               | label: Asset
| serial: X7129              | serial: Y9912
| location: NY               | location: LDN
| owner: "Alice"             | owner: "Bob"
- Read-Through Logic (The "Virtual Property"): When you query UserA_Laptop, the Graph Core performs a "Resolve Master" operation.
  GET /nodes/UserA?resolve_master=true
  - The engine fetches the Instance properties.
  - It checks for an INSTANCE_OF relationship.
  - It merges the Master properties under the Instance properties.
  - Result: {serial: X7129, cpu: "M3 Max", maintenance_guide: "url..."}.
  - Benefit: This is "prototypal inheritance" at the database level.
- Mass Updates: If Apple recalls the battery, you update the Master Node with recall_active: true. Instantly, all 50,000 instances reflect this status in future API calls and analytics reports.
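The merge step can be sketched in a few lines. The function name resolve_master is hypothetical, and the sketch assumes that "under" means Instance values win whenever both nodes define the same key.

```python
from typing import Optional


def resolve_master(instance: dict, master: Optional[dict]) -> dict:
    """Merge Master properties underneath Instance properties."""
    if master is None:               # no INSTANCE_OF relationship found
        return dict(instance)
    return {**master, **instance}    # instance keys overwrite master keys


master = {"cpu": "M3 Max", "maintenance_guide": "url..."}
instance = {"serial": "X7129", "location": "NY", "owner": "Alice"}

print(resolve_master(instance, master))

# A change on the Master shows up on the next read of any instance.
master["recall_active"] = True
print(resolve_master(instance, master)["recall_active"])  # True
```

Because the merge happens at read time, nothing is copied onto the instances, which is what makes the mass update a single write.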
The database contains "System Nodes" that describe the graph itself. These start with an underscore _ to distinguish them from user data.
- _Concept nodes: A graph representation of the Ontology.
  - Often, the Ontology lives in Postgres, but we mirror it into Neo4j for fast queries.
  - (Node:Server)-[:IS_A]->(_Concept:Server)
  - Allows for super-fast type queries ("Find all Concepts that inherit from Asset").
- _Tenant nodes: Used for multi-tenancy.
  - Strict Rule: All data nodes must have a BELONGS_TO path to a _Tenant node.
  - This allows logical separation of each tenant's data on a shared graph.
Data is rarely static. In a Knowledge Graph, when something was true is often as important as what is true. ModelKG implements a Head/History versioning pattern (Slowly Changing Dimensions Type 2).
We do not overwrite critical data; we append it.
TIMELINE: T1 (Creation)      TIMELINE: T2 (Update Status)

(Head Pointer)               (Head Pointer MOVES)
      |                            |
      v                            v
[ Node: Task-A ]             [ Node: Task-A_v2 ] ----[:PREVIOUS_VERSION]---> [ Node: Task-A_v1 ]
| status: TODO |             | status: DONE     |                            | status: TODO     |
| active: true |             | active: true     |                            | active: false    |
                             | modified: T2     |                            | valid_until: T2  |
When updating a node flagged with the Versioned trait:
- Lock: Acquire a write lock on the node.
- Clone: Copy the existing node entirely to a new node, suffixing the ID or using a dedicated history UUID.
- Update: Update the original (Head) node with the new properties.
- Link: Create a PREVIOUS_VERSION edge from Head -> Clone.
- Timestamp: Set valid_until = NOW() on the clone.
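The write path above can be sketched against plain dicts and an edge list. This is an illustration, not the Graph Core code: update_versioned is a hypothetical name, the Lock step is left out because the sketch is single-threaded, and re-pointing an older PREVIOUS_VERSION edge onto the new clone is an assumption about how the chain is kept linear.

```python
import time
import uuid


def update_versioned(nodes: dict, edges: list, node_id: str, changes: dict, now=None) -> str:
    """Apply `changes` to the Head node and return the id of the history clone."""
    now = time.time() if now is None else now
    head = nodes[node_id]

    # Clone + Timestamp: copy the current state under a dedicated history id.
    clone_id = f"{node_id}:hist:{uuid.uuid4()}"
    nodes[clone_id] = dict(head, active=False, valid_until=now)

    # Assumed detail: an existing history link moves from the Head to the
    # new clone, so the chain reads Head -> newest clone -> older clones.
    for i, (src, rel, dst) in enumerate(edges):
        if src == node_id and rel == "PREVIOUS_VERSION":
            edges[i] = (clone_id, rel, dst)

    # Update: the Head keeps its id and takes the new properties.
    head.update(changes, modified=now)

    # Link: Head -> Clone.
    edges.append((node_id, "PREVIOUS_VERSION", clone_id))
    return clone_id


nodes = {"Task-A": {"status": "TODO", "active": True}}
edges = []
clone = update_versioned(nodes, edges, "Task-A", {"status": "DONE"}, now=2)
print(nodes["Task-A"]["status"], nodes[clone]["status"])  # DONE TODO
```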
- Standard Read: By default, Cypher queries (MATCH (n)) only match the Head node (we filter out history nodes via Label strategies like :History).
- Time Travel Query: "What was the status of Project X last Tuesday?"
  - The Graph Core traverses the PREVIOUS_VERSION chain.
  - It looks for the node where created_at <= Tuesday AND valid_until > Tuesday.
  - It effectively reconstructs the state of the world at that moment.
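A minimal sketch of the time travel lookup, under two assumptions that the text does not spell out: each version's created_at records when that version became current, and the Head has no valid_until (modelled as None), meaning it is still current. The name state_at is hypothetical.

```python
from typing import Optional


def state_at(versions: list, when: float) -> Optional[dict]:
    """Return the version that was current at `when`, or None if none existed.

    `versions` is the PREVIOUS_VERSION chain, starting at the Head.
    """
    for version in versions:                       # newest first
        until = version["valid_until"]
        started = version["created_at"] <= when
        still_valid = until is None or until > when
        if started and still_valid:
            return version
    return None


chain = [
    {"status": "DONE", "created_at": 20, "valid_until": None},  # Head
    {"status": "TODO", "created_at": 10, "valid_until": 20},    # history node
]
print(state_at(chain, 15)["status"])  # TODO
print(state_at(chain, 25)["status"])  # DONE
print(state_at(chain, 5))             # None
```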
The true power of a graph is "the space between the nodes." Graph Core exposes powerful "Graph Algorithms as a Service."
A common frontend problem: "I have a User ID. I need their Department, their Projects, and their Manager." In REST, this is 3 calls. In GraphQL, it's one. ModelKG offers a tunable Expansion Endpoint.
Endpoint: GET /api/expand/{node_id}
Parameters:
- depth (int): How many hops to go? (e.g., depth=2)
- types (list): Filter on edge types (e.g., types=MEMBER_OF|MANAGES)
- direction (enum): INCOMING, OUTGOING, or BOTH.
Visual Example:
[ User: Alice ]
|
+---(MEMBER_OF)--> [ Team: DevOps ] --(RESPONSIBLE_FOR)--> [ Svc: Payment API ]
|
+---(OWNS)--> [ Device: Laptop ] (Ignored if filter excludes OWNS)
Response Format: The API returns a JSON representation of the Subgraph.
{
"center": "User:Alice",
"nodes": [ ... list of all found nodes ... ],
"edges": [ ... list of valid edges ... ]
}
This returns the whole neighborhood in a single round trip instead of one request per relationship.
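How the three parameters interact can be sketched with a breadth-first walk over an in-memory edge list. The function expand and the (source, type, target) tuple format are assumptions for illustration. Including the center node in the returned node list is also a choice made here, not something the response format above specifies.

```python
from collections import deque


def expand(edges, center, depth=1, types=None, direction="BOTH"):
    """Breadth-first expansion around `center`; edges are (src, type, dst) tuples."""
    seen = {center}
    found_edges = []
    queue = deque([(center, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue                      # hop budget used up for this branch
        for edge in edges:
            src, rel, dst = edge
            if types is not None and rel not in types:
                continue                  # edge type filtered out
            if src == node and direction in ("OUTGOING", "BOTH"):
                neighbour = dst
            elif dst == node and direction in ("INCOMING", "BOTH"):
                neighbour = src
            else:
                continue
            if edge not in found_edges:
                found_edges.append(edge)
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return {"center": center, "nodes": sorted(seen), "edges": found_edges}


edges = [
    ("User:Alice", "MEMBER_OF", "Team:DevOps"),
    ("Team:DevOps", "RESPONSIBLE_FOR", "Svc:Payment API"),
    ("User:Alice", "OWNS", "Device:Laptop"),
]
result = expand(edges, "User:Alice", depth=2,
                types={"MEMBER_OF", "RESPONSIBLE_FOR"}, direction="OUTGOING")
print(result["nodes"])  # ['Svc:Payment API', 'Team:DevOps', 'User:Alice']
```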
Use Case: A physical Router goes down. What abstract Business Capabilities are affected?
The Graph Core runs a Recursive Dependency Trace:
- Start Node: [Asset: Router-X].
- Breadth-First Search (BFS): Traverse incoming relationships.
- Edge Whitelist: Only follow DEPENDS_ON, HOSTED_ON, POWERED_BY.
- Stop Condition: Stop when we hit nodes labeled BusinessCapability.
Result:
Router-X <- Server-01 <- Database-Main <- App-Billing <- Capability:Revenue Collection.
- Conclusion: "Revenue Collection is at Risk."
- Action: Graph Core returns this chain to the Action Executor to generate an alert.
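The trace can be sketched as a path-carrying BFS over incoming edges. The function impact_chains, the tuple edge format, and the specific edge types assigned to each hop in the example are assumptions. The text names the whitelist but not which type connects which pair.

```python
from collections import deque

WHITELIST = {"DEPENDS_ON", "HOSTED_ON", "POWERED_BY"}


def impact_chains(edges, labels, start):
    """Return every path from `start` up to a BusinessCapability node.

    `edges` are (src, type, dst) tuples; `labels` maps node id -> label.
    """
    chains = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if labels.get(node) == "BusinessCapability":
            chains.append(path)           # stop condition: do not go further
            continue
        for src, rel, dst in edges:
            # Incoming whitelisted edge: `src` relies on `node`.
            if dst == node and rel in WHITELIST and src not in path:
                queue.append(path + [src])
    return chains


edges = [
    ("Server-01", "DEPENDS_ON", "Router-X"),
    ("Database-Main", "HOSTED_ON", "Server-01"),
    ("App-Billing", "DEPENDS_ON", "Database-Main"),
    ("Capability:Revenue Collection", "DEPENDS_ON", "App-Billing"),
    ("Wiki-Page", "MENTIONS", "Router-X"),     # not whitelisted, ignored
]
labels = {"Capability:Revenue Collection": "BusinessCapability"}

for chain in impact_chains(edges, labels, "Router-X"):
    print(" <- ".join(chain))
```

Carrying the full path in the queue, rather than only the frontier node, is what lets the result be handed over as a readable chain.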
Graph Core is not just a database wrapper; it is a Broadcaster. Systems like the Analytics Engine and Action Executor act on changes. They need guaranteed delivery of change events.
If we just "Write to DB" -> "Post to Kafka", we risk dual-write issues. If Kafka is down, the DB write succeeds but the event is lost. Downstream systems become desynchronized.
Solution: The Outbox
- Transaction Start.
- Write Data: Node changes applied to Neo4j.
- Write Event: Event JSON payload is written to a special _Outbox node inside Neo4j within the same transaction.
- Commit Transaction (Atomic).
- Post-Commit Hook: A dedicated background thread (The Event Publisher) polls Neo4j for _Outbox nodes.
- Publish: It pushes the payload to Redpanda.
- Cleanup: Upon receiving an ACK from Redpanda, it deletes the _Outbox node. If the publisher crashes between Publish and Cleanup, the event is sent again on the next poll, so delivery is at-least-once and consumers should deduplicate on event_id.
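The flow can be sketched with an in-memory stand-in for Neo4j and a callable standing in for the broker client. Everything named here (Graph, write_with_event, publish_pending, send) is hypothetical. The point is only the ordering: data and event are committed together, and the outbox entry is removed only after an ACK.

```python
import json
import uuid


class Graph:
    """In-memory stand-in for Neo4j: data nodes plus _Outbox entries."""

    def __init__(self):
        self.nodes = {}
        self.outbox = {}

    def write_with_event(self, node_id, props, event):
        # Stage both writes first so either both land or neither does.
        payload = json.dumps(event)        # fails before anything is stored
        new_nodes = {**self.nodes, node_id: props}
        new_outbox = {**self.outbox, str(uuid.uuid4()): payload}
        self.nodes, self.outbox = new_nodes, new_outbox   # "commit"


def publish_pending(graph, send):
    """One polling pass of the Event Publisher; `send` returns True on ACK."""
    delivered = 0
    for outbox_id, payload in list(graph.outbox.items()):
        if send(payload):
            del graph.outbox[outbox_id]    # cleanup only after the ACK
            delivered += 1
    return delivered


graph = Graph()
graph.write_with_event("node-555", {"status": "OFFLINE"},
                       {"operation": "UPDATE_PROPERTY", "node_id": "node-555"})

print(publish_pending(graph, send=lambda payload: False))  # 0 (broker down, event kept)
print(publish_pending(graph, send=lambda payload: True))   # 1 (broker back, event sent)
print(len(graph.outbox))                                   # 0
```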
We adhere to a strict Event Schema to ensure consumers don't break.
Topic: modelkg.graph.changed
{
"event_id": "evt-88219-abs-221",
"trigger": "UserUpdateAPI",
"timestamp": 1704200000,
"metadata": {
"user_id": "admin-alice",
"correlation_id": "req-123"
},
"payload": {
"operation": "UPDATE_PROPERTY",
"node_id": "node-555",
"concept": "Server",
"labels": ["Server", "Asset"],
"changes": {
"status": {
"old": "ONLINE",
"new": "OFFLINE"
}
}
}
}
This standardized payload allows downstream systems to filter events efficiently (e.g., "Only wake me up for Server changes").
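As an illustration of such a filter, a consumer could check the concept and operation fields before doing any real work. The helper wants_event is hypothetical. The field names come from the schema above.

```python
import json


def wants_event(raw: str, concept: str, operations=None) -> bool:
    """Cheap pre-filter a consumer could run on each message."""
    payload = json.loads(raw)["payload"]
    if payload["concept"] != concept:
        return False
    return operations is None or payload["operation"] in operations


raw = json.dumps({
    "event_id": "evt-88219-abs-221",
    "payload": {
        "operation": "UPDATE_PROPERTY",
        "node_id": "node-555",
        "concept": "Server",
        "labels": ["Server", "Asset"],
        "changes": {"status": {"old": "ONLINE", "new": "OFFLINE"}},
    },
})
print(wants_event(raw, "Server"))                       # True
print(wants_event(raw, "Database"))                     # False
print(wants_event(raw, "Server", {"DELETE_NODE"}))      # False
```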
Scenario: An engineer models a "Perfect Server Setup" manually as a regular node instance (Server-Temp) and wants to make it available as a standard template.
Feature: The Graph Core "Promote" endpoint.
- User clicks "Promote to Master".
- Graph Core validates the node (strips instance-specific data like Serial Numbers).
- Graph Core flips labels from Asset to AssetModel (Master).
- The node is now available in the catalog for instantiation.
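A rough sketch of the validate-and-flip steps, assuming a node is represented as a dict with labels and props. The set of instance-specific keys and the function promote_to_master are hypothetical. The text only names Serial Numbers as an example of what gets stripped.

```python
# Properties assumed to be instance-specific for this sketch.
INSTANCE_ONLY = {"serial", "location", "owner"}


def promote_to_master(node: dict) -> dict:
    """Return a Master version of an instance node."""
    if "Asset" not in node["labels"]:
        raise ValueError("only Asset instances can be promoted")
    # Strip instance-specific data.
    props = {k: v for k, v in node["props"].items() if k not in INSTANCE_ONLY}
    # Flip the label from Asset to AssetModel, keeping any other labels.
    labels = ["AssetModel" if label == "Asset" else label for label in node["labels"]]
    return {"labels": labels, "props": props}


server_temp = {
    "labels": ["Asset", "Server"],
    "props": {"serial": "S-001", "cpu": "64 cores", "ram_gb": 512},
}
print(promote_to_master(server_temp))
# {'labels': ['AssetModel', 'Server'], 'props': {'cpu': '64 cores', 'ram_gb': 512}}
```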
Scenario: Ingestion brings in "Oracle DB" and "Oracle Database" as two separate nodes, splitting the graph.
Feature: Graph Core "Merge" endpoint.
- User selects Winner (Oracle DB) and Loser (Oracle Database).
- Graph Core rewires all edges: Anything pointing to Loser is moved to point to Winner.
- Graph Core copies unique properties from Loser to Winner (merging history).
- Graph Core tags Loser as _Tombstone (soft delete) or deletes it.
- Result: The topology is healed without manual edge-by-edge editing.
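The merge steps can be sketched over dicts and an edge list. The function merge_nodes is hypothetical, the tombstone is modelled as a property rather than a label, and merged_into is an invented field for illustration. The sketch also assumes that "unique properties" means keys the Winner does not already have.

```python
def merge_nodes(nodes: dict, edges: list, winner: str, loser: str) -> list:
    """Fold `loser` into `winner` and return the rewired edge list."""
    rewired = []
    for src, rel, dst in edges:
        src = winner if src == loser else src
        dst = winner if dst == loser else dst
        edge = (src, rel, dst)
        # Skip self-loops (such as a former Winner-Loser edge) and exact duplicates.
        if src != dst and edge not in rewired:
            rewired.append(edge)
    # Copy only the properties the Winner does not already have.
    for key, value in nodes[loser].items():
        nodes[winner].setdefault(key, value)
    nodes[loser] = {"_Tombstone": True, "merged_into": winner}   # soft delete
    return rewired


nodes = {
    "Oracle DB": {"vendor": "Oracle"},
    "Oracle Database": {"vendor": "Oracle Corp", "edition": "Enterprise"},
}
edges = [
    ("App-Billing", "DEPENDS_ON", "Oracle Database"),
    ("App-CRM", "DEPENDS_ON", "Oracle DB"),
]
edges = merge_nodes(nodes, edges, winner="Oracle DB", loser="Oracle Database")
print(edges)
print(nodes["Oracle DB"])  # {'vendor': 'Oracle', 'edition': 'Enterprise'}
```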
The Graph Core is the heavy lifter of the platform. By abstracting the database, it allows us to implement sophisticated features like Time Travel, Master Templates, and Transactional Eventing that would be impossible to maintain if every microservice wrote raw Cypher queries. It turns Neo4j from a storage engine into a Dynamic, Versioned, and Event-Driven Object Modeling Platform.