Global flags: --json (machine-readable output), --config <path> (default: .cartomancer.toml).
| Command | Description |
|---|---|
| `cartomancer init [--force]` | Scaffold a commented `.cartomancer.toml` |
| `cartomancer scan <path>` | Local scan, no GitHub |
| `cartomancer review <owner/repo> <pr> [--resume <id>]` | One-shot PR review via CLI (resume from last checkpoint) |
| `cartomancer history [--branch <name>]` | Browse past scan results |
| `cartomancer findings [<scan-id>] [filters]` | Browse/search findings |
| `cartomancer dismiss <scan-id> <index>` | Dismiss a false positive |
| `cartomancer dismissed` | List dismissed findings |
| `cartomancer undismiss <dismissal-id>` | Remove a dismissal |
| `cartomancer serve` | Webhook server for GitHub events |
| `cartomancer doctor` | Check dependencies and config health |
The global `--json` flag switches all commands to machine-readable output.
| Command | `--json` shape |
|---|---|
| `scan` | `{ "scan_id": <i64\|null>, "findings": [...], "summary": { total, critical, error, warning, info } }` |
| `history`, `findings`, `dismissed` | JSON array; empty results emit `[]` so pipelines stay valid |
| `doctor` | `{ "checks": [...], "summary": { total, ok, warn, error } }`; exit 1 if any check errors |
| `review --dry-run` | `ReviewResult` JSON |
Text output for `scan` prints the scan id on the first line (e.g. `Scan id: 42`) so it can be piped into `dismiss` or `findings`.
1. **Prepare**: clone repo or reuse `--work-dir`
2. **Fetch Meta**: GitHub API → `PrMetadata` (head SHA, base SHA)
3. **Fetch Diff**: GitHub API → raw unified diff → `PullRequestDiff`
4. **Opengrep Scan**: subprocess with `--baseline-commit` → `Vec<Finding>`
5. **Graph Enrich**: cartog impact/refs → `GraphContext` per finding
6. **Escalate**: blast radius + domain → adjusted `Severity`
7. **LLM Deepen**: Ollama or Anthropic → analysis + suggested fix + agent prompt (conditional)
8. **Regression**: compare fingerprints against base branch baseline → new/existing
9. **Dismiss**: filter out dismissed findings by fingerprint match
10. **Persist**: write scan record + findings to SQLite (best-effort)
11. **Post**: GitHub API → categorized inline comments + off-diff caution comments + summary with actionable counts
Stages 4-7 are shared between scan and review commands.
Stages 8-9 are review-only. Stage 10 runs for both scan and review.
Pipeline progress is persisted at coarse checkpoint stages, not after every numbered step. The checkpoints are: prepared (after steps 1-3: metadata + clone + diff), scanned (after step 4), enriched (after step 5), escalated (after step 6), deepened (after step 7), and completed (after step 11). --resume restarts from the last successful checkpoint. --dry-run skips step 11 (Post).
CLI: cartomancer review owner/repo 42
│
├─▶ Resolve GITHUB_TOKEN (env var or config)
│
▼ GitHubClient.fetch_pr_metadata()
PrMetadata { head_sha, base_sha, head_ref, base_ref }
│
├─▶ prepare_work_dir (clone with token auth, or reuse --work-dir)
│
▼ GitHubClient.fetch_diff() + parse_diff()
PullRequestDiff { chunks: Vec<DiffChunk>, files_changed }
│
▼ opengrep::run_opengrep(--baseline-commit base_sha, --exclude patterns)
Vec<Finding> (new findings only, from diff)
│
▼ CartogEnricher.enrich_batch_optimized() — deduped: outline per file, impact+refs per symbol
Vec<Finding> (with GraphContext: blast_radius, callers, domain_tags, is_public_api)
│
▼ SeverityEscalator.escalate_batch(rule_overrides)
Vec<Finding> (severity adjusted: graph context + per-rule min_severity/max_severity overrides)
│
▼ LlmProvider.deepen() — when severity >= threshold AND blast_radius > 3, OR always_deepen=true
Vec<Finding> (with llm_analysis, suggested_fix from ```diff fence, agent_prompt; company knowledge injected)
│
▼ Build ReviewResult + format comments
ReviewResult { findings, summary, head_sha }
│
▼ annotate_regression(base_branch baseline fingerprints)
Vec<Finding> (each annotated is_new: true/false)
│
▼ filter_dismissed(dismissed fingerprints from store)
Vec<Finding> (dismissed findings removed)
│
▼ persist_scan() → SQLite .cartomancer.db (best-effort, never blocks)
│
├─▶ --dry-run? → output JSON/text to stdout, skip posting
│
▼ Post to GitHub
├── Inline findings: POST /repos/{repo}/pulls/{pr}/reviews (categorized, collapsible fix + agent prompt)
├── Off-diff findings: POST /repos/{repo}/issues/{pr}/comments (caution banner)
└── Zero findings: POST summary comment ("no findings detected")
| Base Severity | Blast Radius | Domain | Callers | Final Severity |
|---|---|---|---|---|
| any | >= threshold*4 | any | any | Critical |
| any | >= threshold | any | any | Error (minimum) |
| any | any | auth | any | Critical |
| any | any | payment | any | Critical |
| any | any | any | >= 10 | Error (minimum) |
| any | < threshold | none | < 10 | unchanged |
Default blast_radius_threshold = 5 (configurable in .cartomancer.toml).
Per-rule overrides from [knowledge.rules] are applied around the matrix above:
- `min_severity` is applied as a floor before graph-based escalation
- `max_severity` is applied as a ceiling after all escalation
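The matrix and override ordering above can be sketched as plain functions. `Severity`, `Ctx`, and the function names here are illustrative stand-ins, not the actual cartomancer types:

```rust
// Sketch of the escalation matrix; types and names are assumptions.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Severity { Info, Warning, Error, Critical }

struct Ctx<'a> { blast_radius: u32, domains: &'a [&'a str], callers: u32 }

fn escalate(base: Severity, ctx: &Ctx, threshold: u32) -> Severity {
    use Severity::*;
    // Row 1: huge blast radius is always Critical.
    if ctx.blast_radius >= threshold * 4 { return Critical; }
    // Rows 3-4: auth/payment domains are always Critical.
    if ctx.domains.iter().any(|d| *d == "auth" || *d == "payment") { return Critical; }
    let mut sev = base;
    // Row 2: blast radius over threshold raises to at least Error.
    if ctx.blast_radius >= threshold { sev = sev.max(Error); }
    // Row 5: many callers raises to at least Error.
    if ctx.callers >= 10 { sev = sev.max(Error); }
    sev
}

fn escalate_with_overrides(base: Severity, ctx: &Ctx, threshold: u32,
                           min: Option<Severity>, max: Option<Severity>) -> Severity {
    let floored = min.map_or(base, |m| base.max(m)); // floor before escalation
    let escalated = escalate(floored, ctx, threshold);
    max.map_or(escalated, |m| escalated.min(m))      // ceiling after all escalation
}
```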
LlmProvider (async trait)
fn name() -> &str
fn complete(prompt) -> Result<String>
fn deepen(finding) -> Result<()> // default impl builds prompt + calls complete
│
├── OllamaProvider
│ POST http://localhost:11434/api/chat
│ No API key, stream: false
│ Default for local development
│
└── AnthropicProvider
POST {base_url}/v1/messages (default: https://api.anthropic.com)
x-api-key header, anthropic-version: 2023-06-01
max_tokens validated: 1..=128,000
For production use (with_base_url() for testing)
Provider selected via llm.provider in config. Factory: create_provider(&LlmConfig).
create_provider validates max_tokens before constructing the Anthropic provider.
Triggered only when:
- finding severity >= `llm_deepening_threshold` (default: Error), AND
- blast radius > 3

(Per-rule `always_deepen = true` bypasses both gates.)
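The gate reduces to one boolean expression; field names are assumed for illustration:

```rust
// Illustrative deepening gate; the real signature may differ.
#[derive(Clone, Copy, PartialEq, PartialOrd)]
enum Severity { Info, Warning, Error, Critical }

fn should_deepen(severity: Severity, blast_radius: u32,
                 threshold: Severity, always_deepen: bool) -> bool {
    // always_deepen bypasses both the severity and blast-radius gates.
    always_deepen || (severity >= threshold && blast_radius > 3)
}
```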
Prompt includes:
- Finding details (rule, message, severity, file, code snippet)
- Enclosing function/class body (truncated to 2000 chars, when `enclosing_context = true` in config)
- Structural context from cartog (symbol name, blast radius, callers list, domain tags)
- Task: explain real-world impact (2-3 sentences) + provide suggested fix as a ```diff fenced block
Response parsing:
- `parse_llm_response()` scans for the first ```diff fence → splits into analysis text + optional fix
- If a fix is present, `build_agent_prompt()` generates a self-contained prompt for AI agents (file path, line range, rule, fix)
- Empty diff blocks are normalized to `None`
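A minimal sketch of that splitting logic, assuming only the fence conventions described above (the real `parse_llm_response()` may handle more edge cases):

```rust
// Split an LLM response into (analysis text, optional diff fix).
// Sketch only; mirrors the rules above, not the real implementation.
fn parse_llm_response(resp: &str) -> (String, Option<String>) {
    let opener = "```diff";
    if let Some(start) = resp.find(opener) {
        let analysis = resp[..start].trim().to_string();
        let rest = &resp[start + opener.len()..];
        let body = match rest.find("```") {
            Some(end) => &rest[..end],
            None => rest, // unterminated fence: take everything after the opener
        };
        let fix = body.trim();
        // Empty diff blocks are normalized to None.
        let fix = if fix.is_empty() { None } else { Some(fix.to_string()) };
        (analysis, fix)
    } else {
        (resp.trim().to_string(), None)
    }
}
```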
CLI ──parse args──▶ cmd_review()
│
├─▶ Resolve GITHUB_TOKEN
│
▼
pipeline::run_pipeline()
│
├─▶ GitHubClient.fetch_pr_metadata()
│ └── GET /repos/:owner/:repo/pulls/:pr
│
├─▶ prepare_work_dir()
│ ├── --work-dir + .git exists → reuse
│ ├── --work-dir + empty → clone (kept)
│ └── no --work-dir → clone to temp dir (cleaned up)
│
├─▶ checkout_pr_head()
│ └── git fetch + checkout head SHA
│
├─▶ GitHubClient.fetch_diff()
│ └── GET /repos/:owner/:repo/pulls/:pr
│ Accept: application/vnd.github.diff
│
├─▶ parse_diff() → PullRequestDiff
│
├─▶ opengrep::run_opengrep(--baseline-commit, --exclude)
│ └── subprocess: opengrep scan --json
│
├─▶ CartogEnricher.enrich_batch_optimized()
│ ├── db.outline() per unique file → resolve symbols
│ ├── db.impact() per unique symbol → blast radius
│ ├── db.refs() per unique symbol → callers
│ └── detect_domain_tags() + distribute GraphContext
│
├─▶ SeverityEscalator.escalate_batch(rule_overrides)
│ └── threshold checks + per-rule min/max severity → severity upgrade
│
├─▶ LlmProvider.deepen() (when gates pass OR always_deepen)
│ └── load knowledge file → build prompt → POST /api/chat or /v1/messages
│
└─▶ PipelineResult { review, diff, branch, base_branch, scan_duration, rule_count, temp_dir }
│
cmd_review() continues:
│
├─▶ annotate_regression(base_branch → baseline fingerprints)
├─▶ filter_dismissed(dismissed fingerprints)
├─▶ persist_scan(scan record + findings → SQLite)
│
├─▶ --dry-run? → print ReviewResult, exit
│
├─▶ zero findings? → post_comment(clean summary)
│
└─▶ findings:
├── prepare_review_payload()
│ ├── is_line_in_diff() → categorized inline ReviewComment
│ ├── off-diff → format_off_diff_comment() (caution banner)
│ └── regenerate summary with off-diff listing
├── post_review(inline comments + summary)
└── post_comment() per off-diff finding
| Method | Endpoint | Purpose |
|---|---|---|
| GET | `/repos/{repo}/pulls/{pr}` | PR metadata (head/base SHA) |
| GET | `/repos/{repo}/pulls/{pr}` + `Accept: diff` | Raw unified diff |
| POST | `/repos/{repo}/pulls/{pr}/reviews` | Atomic review with inline comments (COMMENT event) |
| POST | `/repos/{repo}/issues/{pr}/comments` | Summary comment or off-diff findings |
All GET requests use single-retry with 1s delay on 5xx/network errors.
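That retry policy can be sketched as a small wrapper. This sync version retries on any error for brevity; the real client is async and retries only on 5xx/network failures:

```rust
use std::{thread, time::Duration};

// Run `op`; on failure, wait 1s and try exactly once more.
// Sketch of the single-retry policy, not the actual client code.
fn with_single_retry<T, E>(mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    match op() {
        Ok(v) => Ok(v),
        Err(_) => {
            thread::sleep(Duration::from_secs(1));
            op() // second and final attempt
        }
    }
}
```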
| Function | Output |
|---|---|
| `format_inline_comment(finding)` | Severity badge + category (Actionable/Nitpick), rule ID, message, snippet, blast radius, domain, escalation, LLM analysis, collapsible suggested fix, collapsible agent prompt, CWE |
| `format_off_diff_comment(finding)` | `[!CAUTION]` banner with file:line + full inline comment content |
| `format_summary(findings, off_diff, duration, rules)` | Actionable count, severity + category breakdown, top escalated findings, collapsible off-diff listing, scan metadata |
| `format_clean_summary(duration, rules)` | "No findings detected" with scan metadata |
| `classify_finding(finding)` | `CommentCategory`: Actionable (has fix OR severity >= Error) or Nitpick |
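The classification rule reduces to a two-line sketch (stand-in types, assuming the rule above is exhaustive):

```rust
// Illustrative types; the real Finding/CommentCategory live in cartomancer.
#[derive(Clone, Copy, PartialEq, PartialOrd)]
enum Severity { Info, Warning, Error, Critical }
#[derive(Debug, PartialEq)]
enum CommentCategory { Actionable, Nitpick }

fn classify_finding(has_fix: bool, severity: Severity) -> CommentCategory {
    // Actionable when a fix exists OR severity is Error or above.
    if has_fix || severity >= Severity::Error {
        CommentCategory::Actionable
    } else {
        CommentCategory::Nitpick
    }
}
```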
Scan and review results are persisted to a SQLite database (.cartomancer.db by default, configurable via storage.db_path in .cartomancer.toml).
Schema (3 tables, v5): scans (scan metadata + pipeline stage + error_message + work_dir), findings (per-finding data with fingerprint, suggested_fix, agent_prompt), dismissals (false positive suppression). Schema version tracked via PRAGMA user_version.
Fingerprint: SHA-256 of rule_id:file_path:snippet_content. Stable across scans — used for regression detection and dismissal matching. Line numbers excluded because they shift with unrelated edits.
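The key property is the input format: line numbers are simply not hashed. In this sketch std's `DefaultHasher` stands in for SHA-256 so no external crate is needed; the digests it produces are NOT the tool's real fingerprints:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Fingerprint over rule_id:file_path:snippet_content only.
// DefaultHasher is a stand-in for SHA-256 (illustration, not the real digest).
fn fingerprint(rule_id: &str, file_path: &str, snippet: &str) -> u64 {
    let mut h = DefaultHasher::new();
    format!("{rule_id}:{file_path}:{snippet}").hash(&mut h);
    h.finish()
}
```

Because line numbers are excluded, the fingerprint survives unrelated edits that shift the finding up or down the file, which is what makes regression detection and dismissal matching stable.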
Regression detection: During review, each finding's fingerprint is compared against the latest scan on the base branch. Findings present in the baseline are marked is_new: false; new findings are marked is_new: true.
Dismissal: Dismissed findings are suppressed by fingerprint match. If the code at a dismissed location changes (different snippet), the fingerprint changes and the finding reappears.
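Stages 8-9 (regression annotation, then dismissal filtering) both key on fingerprints and can be sketched with set lookups; `Finding` here is a stand-in struct:

```rust
use std::collections::HashSet;

// Stand-in for the real Finding type; only the fields used here.
struct Finding { fingerprint: u64, is_new: bool }

// Stage 8: a finding is new iff its fingerprint is absent from the
// base-branch baseline.
fn annotate_regression(findings: &mut [Finding], baseline: &HashSet<u64>) {
    for f in findings.iter_mut() {
        f.is_new = !baseline.contains(&f.fingerprint);
    }
}

// Stage 9: drop findings whose fingerprint matches a dismissal.
fn filter_dismissed(findings: Vec<Finding>, dismissed: &HashSet<u64>) -> Vec<Finding> {
    findings.into_iter().filter(|f| !dismissed.contains(&f.fingerprint)).collect()
}
```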
Best-effort: All persistence operations are non-blocking. If the DB is unavailable, the pipeline continues and logs a warning (BR-3).
- tokio async runtime for GitHub API calls and LLM calls
- Subprocess for opengrep (tokio::process::Command)
- Sync git clone/checkout (std::process::Command, blocking)
- Sync cartog Database access (rusqlite is not Send; access from a single task)
- Sync Store access (rusqlite, runs once after pipeline completes)
- Findings enrichment runs as a single batch: outline per file, impact+refs per symbol (deduplicated, single-connection)
- Per-finding errors: logged and skipped (partial results are better than no results)
- Pipeline-level errors: fail before posting (clone, opengrep, GitHub API)
- LLM errors: logged and skipped (review still posts without LLM analysis)
- Temp dir cleanup: always cleaned up, even on failure (via TempDir Drop)
- Store errors: logged and skipped — persistence never blocks the pipeline (BR-3)
The serve command runs an axum HTTP server that receives GitHub pull_request webhook events.
POST /webhook → HMAC-SHA256 validation → parse PullRequestEvent → should_review() filter
→ acquire semaphore permit → spawn background: run_pipeline + finalize_and_post
→ return 202 Accepted
GET /health → 200 OK
- HMAC validation: `X-Hub-Signature-256` header verified against `github.webhook_secret`
- Concurrency: bounded by `serve.max_concurrent_reviews` (default 4) via tokio `Semaphore`
- Graceful shutdown: listens for SIGTERM/SIGINT
- Background tasks: each review runs independently with its own temp dir, GitHub client, and store connection
The pipeline persists progress to the scans table after each stage:
pending → prepared → scanned → enriched → escalated → deepened → completed
↘ failed
The prepared stage checkpoints after clone + checkout + diff fetch, before the opengrep scan. This allows --resume to skip the clone step when the scan crashes or times out during opengrep. The scan record also stores the work_dir path so resume can reuse the checkout.
Each stage writes findings to the store and advances the stage column. On failure, the scan is marked failed with an error message. The --resume <scan-id> flag on the review command allows restarting from the last completed stage.
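The stage ladder can be sketched as an ordered enum; the variant names mirror the diagram above, while the resume helper itself is an assumption about how `--resume` could be expressed:

```rust
// Checkpoint stages in pipeline order (variant names from the diagram above).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Stage { Pending, Prepared, Scanned, Enriched, Escalated, Deepened, Completed }

static ORDER: [Stage; 7] = [
    Stage::Pending, Stage::Prepared, Stage::Scanned, Stage::Enriched,
    Stage::Escalated, Stage::Deepened, Stage::Completed,
];

// Given the last completed checkpoint, return the stages still to run.
fn stages_to_run(last: Stage) -> &'static [Stage] {
    let idx = ORDER.iter().position(|s| *s == last).expect("known stage");
    &ORDER[idx + 1..]
}
```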
cartomancer doctor runs a structured health report over everything the pipeline touches. Each check returns one of Ok / Warn / Error; the process exits non-zero when any check is in the Error state.
| Check | Failure level | What it verifies |
|---|---|---|
config |
Error | AppConfig::validate() passes |
git |
Error | git binary in PATH (required for review / serve) |
opengrep |
Error | opengrep --version responds within 10s |
custom-rules |
Warn | opengrep.rules_dir exists and contains at least one .yaml/.yml rule |
knowledge |
Warn | knowledge.knowledge_file is readable and within max_knowledge_chars |
cartog (CLI) |
Warn | cartog --version succeeds |
cartog-db |
Warn | the file at severity.cartog_db_path exists (graph enrichment skipped when missing) |
github-token |
Warn | GITHUB_TOKEN env or github.token is set |
llm-provider |
Warn | configured provider responds to a health check |
storage |
Error | SQLite store at storage.db_path can be opened |
Teams can place custom YAML rule files in .cartomancer/rules/ (configurable via opengrep.rules_dir). These are auto-discovered and passed to opengrep alongside built-in rules via --config <dir>.
- Rules encode team-specific business rules and coding conventions
- Directory is validated against path traversal before use
- `cartomancer doctor` reports the number of discovered rule files
- Missing directory is silently ignored (default rules only)
A markdown/text file (default: .cartomancer/knowledge.md) loaded once and injected into every LLM deepening prompt as a ## Company Context section. Gives the LLM awareness of team conventions, architecture, and security policies.
Additionally, a system_prompt field in [knowledge] config provides a short directive system message for the LLM (e.g. "You are reviewing a fintech codebase").
Security: knowledge file path is validated against path traversal. Binary files (containing null bytes) are rejected. Content is truncated at max_knowledge_chars (default: 8000).
The [knowledge.rules] section in .cartomancer.toml allows per-rule:
- `min_severity`: severity floor (findings below this get upgraded)
- `max_severity`: severity ceiling (findings above this get capped)
- `always_deepen`: bypass severity/blast_radius gates for LLM deepening
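A hedged sketch of how such overrides might look in `.cartomancer.toml`; the rule IDs and severity spellings below are illustrative assumptions, not documented values:

```toml
# Hypothetical per-rule overrides; rule IDs and severity strings are examples only.
[knowledge.rules."sql-injection"]
min_severity = "error"   # floor: applied before graph-based escalation
always_deepen = true     # bypass the severity/blast-radius gates for LLM deepening

[knowledge.rules."todo-comment"]
max_severity = "info"    # ceiling: applied after all escalation
```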
- Multiple LLM providers (OpenAI, local models via LM Studio)
- GitLab / Bitbucket support
- Slack/Teams notifications for Critical findings
- Dashboard / web UI for trend visualization over stored scan history
- Retention policies (pruning old scans)
- Feedback loop: learn from dismissed findings