@@ -11,46 +11,42 @@ A Python library for SQL column lineage analysis. No database required. No infra
11
11
- **Auto-propagate Metadata**: PII flags, ownership, and descriptions flow automatically through lineage
12
12
- **Context for AI Agents**: Provide LLMs with structured lineage data for smarter data assistance
13
13
- **CI/CD Change Detection**: Detect lineage changes between pipeline versions for automated testing
14
+
- **Automatic DAG Construction**: Execute pipelines in Python (async or sequential) with topological ordering, or generate Airflow DAGs
14
15
15
16
## Why We Built This
16
17
17
-
Column lineage is notoriously difficult. Traditional tools reverse-engineer lineage from query logs and execution metadata, requiring expensive platform integration and complex infrastructure. Most open-source alternatives focus only on table-level lineage or single-query column analysis.
18
+
**Your SQL already contains everything.** Tables, columns, transformations, joins—it's all there in your code.
18
19
19
-
**Our insight**: When SQL is written with explicit column names and clear transformations (what we call "[lineage-friendly SQL](https://clgraph.dev/blog/writing-lineage-friendly-sql/)"), static analysis can provide *perfect* column lineage—without database access, without runtime integration, and without query logs.
20
+
Traditional tools reverse-engineer lineage from query logs and database metadata, requiring expensive infrastructure. But when SQL is written with explicit column names and clear transformations (what we call "[lineage-friendly SQL](https://clgraph.dev/blog/writing-lineage-friendly-sql/)"), static analysis can build a *complete* lineage graph, without database access, without runtime integration, and without query logs.
20
21
21
-
We built clgraph to prove this approach works. By combining lineage-friendly SQL with perfect static analysis, we solve 90% of column lineage needs with 10% of the complexity of enterprise tools. No database required. No infrastructure to maintain. Just pure Python analyzing your SQL files.
22
+
**We parse it once. You get the complete graph.** It's a Python object you can traverse, query, and integrate however you want — powering tracing, impact analysis, metadata propagation, DAG construction, and more.
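To make "a Python object you can traverse" concrete, here is a minimal sketch of impact analysis over a column lineage graph. It is an illustration only: the `downstream` dictionary, the column names, and the `impacted_columns` helper are hypothetical and do not reflect clgraph's actual API or data model.

```python
from collections import deque

# Hypothetical lineage graph: each column maps to the columns derived from it.
# A plain dict stands in for the graph object; this is NOT clgraph's API.
downstream = {
    "raw.users.email": ["staging.users.email_hash"],
    "staging.users.email_hash": ["marts.customers.contact_key"],
    "raw.orders.amount": ["marts.revenue.total_amount"],
    "marts.customers.contact_key": [],
    "marts.revenue.total_amount": [],
}


def impacted_columns(graph, start):
    """Return every column reachable downstream of `start`, breadth-first."""
    seen = set()
    order = []
    queue = deque(graph.get(start, []))
    while queue:
        col = queue.popleft()
        if col in seen:
            continue  # already visited via another path
        seen.add(col)
        order.append(col)
        queue.extend(graph.get(col, []))
    return order


# Which columns are affected if raw.users.email changes?
affected = impacted_columns(downstream, "raw.users.email")
print(affected)
```

The same traversal, run over reversed edges, would trace a column back to its sources; metadata propagation can be thought of as carrying a flag along each edge visited.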
22
23
23
24
**Read more**:
24
25
- [Why We Built This (Full Story)](https://clgraph.dev/concepts/why-we-built-this/)
25
26
- [How to Write Lineage-Friendly SQL](https://clgraph.dev/blog/writing-lineage-friendly-sql/)
26
27
27
28
## Features
28
29
29
-
### Column Lineage Analysis
30
-
- **Perfect column lineage** for any single SQL query, no matter how complex
31
-
- **Recursive query parsing** - handles arbitrary nesting of CTEs and subqueries