For AI Coding Assistants: This document provides comprehensive guidance for understanding and developing Cartography intel modules. It contains codebase-specific patterns, architectural decisions, and implementation details necessary for effective AI-assisted development within the Cartography project.
This guide teaches you how to write intel modules for Cartography using the modern data model approach. We'll walk through real examples from the codebase to show you the patterns and best practices.
- Procedure Documentation - Links to detailed guides
- AI Assistant Quick Reference - Key concepts and imports
- Git and Pull Request Guidelines - Commit signing and PR templates
- Quick Start - Copy an existing module
- Quick Reference Cheat Sheet - Copy-paste templates
Detailed procedures are available in separate documents:
| Procedure | Description |
|---|---|
| Creating a New Module | Complete guide to creating a new Cartography intel module |
| Enriching the Ontology | Adding ontology mappings for cross-module querying |
| Adding a New Node Type | Advanced node schema properties and configurations |
| Adding a New Relationship | Relationships, MatchLinks, and multi-module patterns |
| Adding Analysis Jobs | Post-ingestion graph enrichment and cross-resource analysis |
| Creating Security Rules | Security rules, facts, and compliance conventions |
| Refactoring Legacy Code | Converting legacy Cypher to modern data model |
| Troubleshooting | Common errors, debugging tips, and key files reference |
Key Cartography Concepts:
- Intel Module: Component that fetches data from external APIs and loads into Neo4j
- Sync Pattern: `get()` -> `transform()` -> `load()` -> `cleanup()` -> `analysis()` (optional)
- Data Model: Declarative schema using `CartographyNodeSchema` and `CartographyRelSchema`
- Update Tag: Timestamp used for cleanup jobs to remove stale data
- Analysis Jobs: Post-ingestion queries that enrich the graph (e.g., internet exposure, permission inheritance)
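To make the update tag concept concrete, here is a minimal sketch of how tagging and cleanup interact across two sync runs. It uses a plain dictionary as a stand-in for the graph; `load_nodes` and `cleanup_stale` are hypothetical helpers invented for illustration, not Cartography functions, and the real cleanup is performed by Cypher jobs against Neo4j.

```python
# Illustrative stand-in for a graph: node id -> properties.
# No Neo4j or Cartography code is involved; all names here are hypothetical.
def load_nodes(graph: dict[str, dict], records: list[dict], update_tag: int) -> None:
    """Upsert each record and stamp it with the current sync's update tag."""
    for record in records:
        graph[record["id"]] = {**record, "lastupdated": update_tag}


def cleanup_stale(graph: dict[str, dict], update_tag: int) -> list[str]:
    """Delete nodes that were not touched by the current sync run."""
    stale_ids = [
        node_id for node_id, props in graph.items()
        if props["lastupdated"] != update_tag
    ]
    for node_id in stale_ids:
        del graph[node_id]
    return sorted(stale_ids)


graph: dict[str, dict] = {}
# First sync sees resources "a" and "b".
load_nodes(graph, [{"id": "a"}, {"id": "b"}], update_tag=1)
# Second sync only sees "a", so "b" keeps the old tag.
load_nodes(graph, [{"id": "a"}], update_tag=2)
removed = cleanup_stale(graph, update_tag=2)
print(removed)  # ['b']
```

The point of the sketch is that cleanup never needs to know what was deleted upstream: anything not re-stamped during the current run is treated as stale.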
Critical Files to Know:
- `cartography/config.py` - Configuration object definitions
- `cartography/cli.py` - Typer-based CLI with organized help panels
- `cartography/client/core/tx.py` - Core `load()` function
- `cartography/graph/job.py` - Cleanup job utilities
- `cartography/models/core/` - Base data model classes
Essential Imports:
import logging
from dataclasses import dataclass
from cartography.models.core.common import PropertyRef
from cartography.models.core.nodes import CartographyNodeProperties, CartographyNodeSchema, ExtraNodeLabels
from cartography.models.core.relationships import (
    CartographyRelProperties, CartographyRelSchema, LinkDirection,
    make_target_node_matcher, TargetNodeMatcher, OtherRelationships,
    make_source_node_matcher, SourceNodeMatcher,
)
from cartography.client.core.tx import load, load_matchlinks
from cartography.graph.job import GraphJob
from cartography.util import timeit
# For analysis jobs (optional)
from cartography.util import run_analysis_job, run_scoped_analysis_job, run_analysis_and_ensure_deps
logger = logging.getLogger(__name__)

PropertyRef Quick Reference:
PropertyRef("field_name") # Value from data dict
PropertyRef("KWARG_NAME", set_in_kwargs=True) # Value from load() kwargs
PropertyRef("field", extra_index=True) # Create database index
PropertyRef("field_list", one_to_many=True) # One-to-many relationships

Debugging Tips:
- Check existing patterns in `cartography/intel/` before creating new ones
- Ensure `__init__.py` files exist in all module directories
- Look at `tests/integration/cartography/intel/` for similar test patterns
- Review `cartography/models/` for existing relationship patterns
Signing Commits: All commits must be signed off using the `-s` flag (not `-S`, which creates a GPG signature). This adds a Signed-off-by line to your commit message, certifying that you have the right to submit the code under the project's license.
# Sign off a commit with a message
git commit -s -m "feat(module): add new feature"

Pull Request Descriptions: When creating a pull request, use the template at `.github/pull_request_template.md`.
The fastest way to get started is to copy the structure from an existing module:
- Simple module: `cartography/intel/lastpass/` - Basic user sync with API calls
- Complex module: `cartography/intel/aws/ec2/instances.py` - Multiple relationships and data types
- Reference documentation: `docs/root/dev/writing-intel-modules.md`
For detailed step-by-step instructions, see Creating a New Module.
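As a rough illustration of the transform stage in the sync pattern, the sketch below flattens a nested API response into the flat dictionaries that a node schema's `PropertyRef` fields would read. The payload shape, the field names, and the `group_ids` key are all assumptions made up for this example. It follows a common convention of accessing required fields directly (so a missing value fails loudly) and optional fields with `.get()`.

```python
from typing import Any

# Hypothetical API payload; every field name here is invented for illustration.
RAW_USERS: list[dict[str, Any]] = [
    {
        "id": "user-123",
        "profile": {"email": "alice@example.com"},
        "groups": [{"id": "g1"}, {"id": "g2"}],
    },
    {"id": "user-456", "profile": {}, "groups": []},
]


def transform(raw_users: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten nested API records into one flat dict per node."""
    result = []
    for user in raw_users:
        result.append({
            # Required field: direct access raises KeyError if it is missing.
            "id": user["id"],
            # Optional field: becomes None when the API omits it.
            "email": user.get("profile", {}).get("email"),
            # List field, suitable for a one_to_many=True matcher.
            "group_ids": [group["id"] for group in user.get("groups", [])],
        })
    return result


transformed = transform(RAW_USERS)
print(transformed[0]["group_ids"])  # ['g1', 'g2']
```

Keeping `transform()` free of network and database calls, as here, means it can be unit tested with nothing but mock data.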
# Assumes the Essential Imports above, plus:
from typing import Any

import neo4j

@timeit
def sync(neo4j_session: neo4j.Session, api_key: str, tenant_id: str,
         update_tag: int, common_job_parameters: dict[str, Any]) -> None:
    """
    Main sync entry point for the module.
    """
    logger.info("Starting MyResource sync")

    # 1. GET - Fetch data from API
    logger.debug("Fetching MyResource data from API")
    raw_data = get(api_key, tenant_id)

    # 2. TRANSFORM - Shape data for ingestion
    logger.debug("Transforming %d MyResource items", len(raw_data))
    transformed = transform(raw_data)

    # 3. LOAD - Ingest to Neo4j
    load_entities(neo4j_session, transformed, tenant_id, update_tag)

    # 4. CLEANUP - Remove stale data
    logger.debug("Running MyResource cleanup job")
    cleanup(neo4j_session, common_job_parameters)

    logger.info("Completed MyResource sync")

def load_entities(neo4j_session: neo4j.Session, data: list[dict],
                  tenant_id: str, update_tag: int) -> None:
    load(neo4j_session, YourSchema(), data,
         lastupdated=update_tag, TENANT_ID=tenant_id)

def cleanup(neo4j_session: neo4j.Session, common_job_parameters: dict[str, Any]) -> None:
    logger.debug("Running cleanup job for MyResource")
    GraphJob.from_node_schema(YourSchema(), common_job_parameters).run(neo4j_session)

@dataclass(frozen=True)
class YourNodeProperties(CartographyNodeProperties):
    id: PropertyRef = PropertyRef("id")  # REQUIRED
    lastupdated: PropertyRef = PropertyRef("lastupdated", set_in_kwargs=True)  # REQUIRED
    # Your business properties here...

# OUTWARD: (:Source)-[:REL]->(:Target)
direction: LinkDirection = LinkDirection.OUTWARD
# INWARD: (:Source)<-[:REL]-(:Target)
direction: LinkDirection = LinkDirection.INWARD

# Transform: Create list field
{"entity_id": "123", "related_ids": ["a", "b", "c"]}
# Schema: Use one_to_many=True
target_node_matcher: TargetNodeMatcher = make_target_node_matcher({
    "id": PropertyRef("related_ids", one_to_many=True),
})

@dataclass(frozen=True)
class YourMatchLinkRelProperties(CartographyRelProperties):
    # Required properties for MatchLinks
    lastupdated: PropertyRef = PropertyRef("lastupdated", set_in_kwargs=True)
    _sub_resource_label: PropertyRef = PropertyRef("_sub_resource_label", set_in_kwargs=True)
    _sub_resource_id: PropertyRef = PropertyRef("_sub_resource_id", set_in_kwargs=True)

# Defined after the properties class so that the `properties` default below resolves
@dataclass(frozen=True)
class YourMatchLinkSchema(CartographyRelSchema):
    target_node_label: str = "TargetNode"
    target_node_matcher: TargetNodeMatcher = make_target_node_matcher({
        "id": PropertyRef("target_id"),
    })
    source_node_label: str = "SourceNode"
    source_node_matcher: SourceNodeMatcher = make_source_node_matcher({
        "id": PropertyRef("source_id"),
    })
    direction: LinkDirection = LinkDirection.OUTWARD
    rel_label: str = "CONNECTS_TO"
    properties: YourMatchLinkRelProperties = YourMatchLinkRelProperties()

# Load and cleanup MatchLinks
load_matchlinks(
    neo4j_session, YourMatchLinkSchema(), mapping_data,
    lastupdated=update_tag, _sub_resource_label="AWSAccount", _sub_resource_id=account_id,
)
GraphJob.from_matchlink(YourMatchLinkSchema(), "AWSAccount", account_id, update_tag).run(neo4j_session)

cartography/intel/your_service/
├── __init__.py # Main entry point
└── entities.py # Domain sync modules
cartography/models/your_service/
├── entity.py # Data model definitions
└── tenant.py # Tenant model
tests/data/your_service/
└── entities.py # Mock test data
tests/integration/cartography/intel/your_service/
└── test_entities.py # Integration tests
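As one possible shape for the mock data file in the layout above, the sketch below defines a small fixture and derives the expected tuple sets from it. The record fields and the `expected_tuples` helper are hypothetical and not part of Cartography's test utilities. Hardcoding the expected sets, as the assertions that follow do, is equally valid and often easier to read.

```python
# Sketch of tests/data/your_service/entities.py (contents are hypothetical).
MOCK_USERS = [
    {"id": "user-123", "email": "alice@example.com", "tenant_id": "tenant-123"},
    {"id": "user-456", "email": "bob@example.com", "tenant_id": "tenant-123"},
]


def expected_tuples(records: list[dict], fields: list[str]) -> set[tuple]:
    """Build a set of tuples, one per record, holding the given fields in order."""
    return {tuple(record[field] for field in fields) for record in records}


# Sets of tuples, the same shape the node and relationship assertions compare against.
expected_nodes = expected_tuples(MOCK_USERS, ["id", "email"])
expected_rels = expected_tuples(MOCK_USERS, ["id", "tenant_id"])
print(sorted(expected_nodes))
```

Deriving expectations from the fixture keeps the two in sync when the mock data changes, at the cost of making a test failure slightly less obvious to read.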
from tests.integration.util import check_nodes, check_rels
# Check nodes
expected_nodes = {("user-123", "alice@example.com")}
assert check_nodes(neo4j_session, "YourServiceUser", ["id", "email"]) == expected_nodes
# Check relationships
expected_rels = {("user-123", "tenant-123")}
assert check_rels(
    neo4j_session,
    "YourServiceUser", "id",
    "YourServiceTenant", "id",
    "RESOURCE",
    rel_direction_right=True,
) == expected_rels

Remember: Start simple, iterate, and use existing modules as references. The Cartography community is here to help!