@@ -788,6 +788,203 @@ The server exposes all clgraph lineage tools:
788 788 | `pipeline://tables` | List of all tables with metadata |
789 789 | `pipeline://tables/{name}` | Detailed info for a specific table |
790 790
791+ ## CLI Reference
792+
793+ clgraph ships a command-line interface for analyzing SQL lineage without writing Python.
794+
795+ ```
796+ clgraph [COMMAND] [OPTIONS]
797+ ```
798+
799+ ### `clgraph analyze`
800+
801+ Parse SQL files and display a column-lineage summary.
802+
803+ ```bash
804+ clgraph analyze PATH [--dialect DIALECT] [--format table|json|dot]
805+ ```
806+
807+ | Option | Default | Description |
808+ |--------|---------|-------------|
809+ | `PATH` | *(required)* | SQL file, directory of `.sql` files, or JSON pipeline file |
810+ | `--dialect` | `bigquery` | SQL dialect (bigquery, snowflake, postgres, mysql, duckdb, clickhouse, …) |
811+ | `--format`, `-f` | `table` | Output format: **table** (Rich table), **json** (machine-readable), **dot** (Graphviz) |
812+
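For scripting, the options above can be assembled into an argument list before being handed to `subprocess`. The sketch below is illustrative only: `build_analyze_cmd` and `run_analyze` are hypothetical helpers, not part of clgraph, and they rely solely on the `--dialect` and `--format` flags documented in the table.

```python
import subprocess
from typing import List

# Output formats listed in the options table above.
VALID_FORMATS = ("table", "json", "dot")


def build_analyze_cmd(path: str, dialect: str = "bigquery", fmt: str = "table") -> List[str]:
    """Build the argv for `clgraph analyze` from the documented options."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return ["clgraph", "analyze", path, "--dialect", dialect, "--format", fmt]


def run_analyze(path: str, dialect: str = "bigquery", fmt: str = "table") -> str:
    """Run the command and return its stdout (requires clgraph on PATH)."""
    cmd = build_analyze_cmd(path, dialect=dialect, fmt=fmt)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems when `PATH` contains spaces.
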
813+ ### `clgraph diff`
814+
815+ Compare lineage between two pipeline versions — useful for reviewing the impact of SQL changes in PRs.
816+
817+ ```bash
818+ clgraph diff OLD_PATH NEW_PATH [--dialect DIALECT] [--format table|json]
819+ ```
820+
821+ | Option | Default | Description |
822+ |--------|---------|-------------|
823+ | `OLD_PATH` | *(required)* | Path to old SQL file or directory |
824+ | `NEW_PATH` | *(required)* | Path to new SQL file or directory |
825+ | `--dialect` | `bigquery` | SQL dialect |
826+ | `--format`, `-f` | `table` | Output format: **table** or **json** |
827+
828+ ### `clgraph mcp`
829+
830+ Start an MCP server so AI assistants (Claude Desktop, Cursor, etc.) can query your lineage graph.
831+
832+ ```bash
833+ clgraph mcp --pipeline PATH [--dialect DIALECT] [--transport stdio|http] [--no-llm-tools]
834+ ```
835+
836+ | Option | Default | Description |
837+ |--------|---------|-------------|
838+ | `--pipeline`, `-p` | *(required)* | Path to SQL directory or JSON pipeline file |
839+ | `--dialect` | `bigquery` | SQL dialect |
840+ | `--transport` | `stdio` | Transport type: **stdio** (Claude Desktop) or **http** (remote clients) |
841+ | `--no-llm-tools` | `false` | Exclude LLM-dependent tools from the server |
842+
843+ Requires: `pip install clgraph[mcp]`
844+
845+ ---
846+
847+ ## End-to-End CLI Walkthrough
848+
849+ This walkthrough uses the example files in [`examples/cli_e2e/`](examples/cli_e2e/).
850+ The pipeline has three SQL files: `users`, `orders`, and a `user_spend` mart.
851+
852+ ### Step 1 — Analyze the pipeline
853+
854+ ```bash
855+ $ clgraph analyze examples/cli_e2e/v1/
856+ ```
857+
858+ ```
859+ Pipeline Tables
860+ ┏━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┓
861+ ┃ Table         ┃ Type    ┃ Columns ┃ Upstream ┃ Downstream ┃
862+ ┡━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━┩
863+ │ users         │ derived │ 4       │ 1        │ 1          │
864+ │ source_users  │ source  │ 4       │ 0        │ 1          │
865+ │ orders        │ derived │ 4       │ 1        │ 1          │
866+ │ source_orders │ source  │ 4       │ 0        │ 1          │
867+ │ user_spend    │ derived │ 7       │ 2        │ 0          │
868+ └───────────────┴─────────┴─────────┴──────────┴────────────┘
869+
870+ 5 tables, 23 columns, 15 lineage edges
871+ ```
872+
873+ clgraph discovered 5 tables (2 sources, 2 intermediate, 1 final mart) and traced 15 column-level lineage edges — all from static SQL analysis.
874+
875+ ### Step 2 — Get JSON output for CI/scripts
876+
877+ ```bash
878+ $ clgraph analyze examples/cli_e2e/v1/ --format json
879+ ```
880+
881+ ```json
882+ {
883+   "dialect": "bigquery",
884+   "tables": [
885+     {
886+       "name": "users",
887+       "is_source": false,
888+       "columns": [
889+         {"name": "user_id", "type": "direct_column", "pii": false},
890+         {"name": "email", "type": "direct_column", "pii": true},
891+         {"name": "signup_date", "type": "direct_column", "pii": false},
892+         {"name": "country", "type": "direct_column", "pii": false}
893+       ]
894+     }
895+   ],
896+   "columns": 23,
897+   "edges": 15,
898+   "issues": 0
899+ }
900+ ```
901+
902+ Notice that `email` already has `"pii": true`, parsed from the `[pii: true]` comment in the SQL.
903+
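A script can pull the PII-flagged columns out of this JSON with a few lines of Python. The sketch below assumes only the fields visible in the sample above (`tables`, `columns`, `name`, `pii`); the real output may carry more. `pii_columns` and `SAMPLE_REPORT` are hypothetical names introduced for illustration.

```python
import json
from typing import List

# Abbreviated sample in the shape shown above; the real report may have more fields.
SAMPLE_REPORT = """
{
  "dialect": "bigquery",
  "tables": [
    {
      "name": "users",
      "is_source": false,
      "columns": [
        {"name": "user_id", "type": "direct_column", "pii": false},
        {"name": "email", "type": "direct_column", "pii": true}
      ]
    }
  ]
}
"""


def pii_columns(report: dict) -> List[str]:
    """Return fully qualified names (table.column) of columns flagged as PII."""
    return [
        f"{table['name']}.{column['name']}"
        for table in report.get("tables", [])
        for column in table.get("columns", [])
        if column.get("pii")
    ]


report = json.loads(SAMPLE_REPORT)
print(pii_columns(report))  # prints ['users.email']
```

In CI, the same function could read the report from `clgraph analyze ... --format json` on standard input instead of the embedded sample.
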
904+ ### Step 3 — Generate a Graphviz diagram
905+
906+ ```bash
907+ $ clgraph analyze examples/cli_e2e/v1/ --format dot | dot -Tpng -o lineage.png
908+ ```
909+
910+ The DOT output is a standard Graphviz ` digraph ` that shows table dependencies:
911+
912+ ```dot
913+ digraph {
914+   rankdir=LR
915+   source_users -> users [label=CREATE]
916+   source_orders -> orders [label=CREATE]
917+   users -> user_spend [label=CREATE]
918+   orders -> user_spend [label=CREATE]
919+ }
920+ ```
921+
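Because the DOT output is plain text, the table dependencies can also be read back without Graphviz. The following is a rough sketch that assumes edges appear one per line as unquoted `a -> b` pairs, as in the sample above; quoted or multi-target edges would need a real DOT parser. `parse_parents` and `all_upstream` are hypothetical helpers.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# The DOT text from the sample above.
DOT_TEXT = """
digraph {
  rankdir=LR
  source_users -> users [label=CREATE]
  source_orders -> orders [label=CREATE]
  users -> user_spend [label=CREATE]
  orders -> user_spend [label=CREATE]
}
"""

# Matches simple, unquoted edges such as `users -> user_spend`.
EDGE_RE = re.compile(r"^\s*(\w+)\s*->\s*(\w+)", re.MULTILINE)


def parse_parents(dot_text: str) -> Dict[str, Set[str]]:
    """Map each table to the set of tables it reads from directly."""
    parents: Dict[str, Set[str]] = defaultdict(set)
    for source, target in EDGE_RE.findall(dot_text):
        parents[target].add(source)
    return parents


def all_upstream(table: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every direct and transitive upstream table via depth-first search."""
    seen: Set[str] = set()
    stack = [table]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(all_upstream("user_spend", parse_parents(DOT_TEXT))))
# prints ['orders', 'source_orders', 'source_users', 'users']
```
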
922+ ### Step 4 — Diff two versions to review impact
923+
924+ Now suppose a teammate adds a `tier` column to users, a `discount` column to orders,
925+ and updates `user_spend` to compute `lifetime_net_spend`. Compare old vs new:
926+
927+ ```bash
928+ $ clgraph diff examples/cli_e2e/v1/ examples/cli_e2e/v2/
929+ ```
930+
931+ ```
932+ +6 columns added
933+ + source_orders.discount
934+ + orders.discount
935+ + user_spend.tier
936+ + users.tier
937+ + user_spend.lifetime_net_spend
938+ + source_users.tier
939+ ```
940+
941+ The diff lists every column that was added, removed, or modified across the entire pipeline, which makes it well suited to code review or CI gates.
942+
943+ ### Step 5 — Get diff as JSON for automation
944+
945+ ```bash
946+ $ clgraph diff examples/cli_e2e/v1/ examples/cli_e2e/v2/ --format json
947+ ```
948+
949+ ```json
950+ {
951+   "columns_added": [
952+     "source_orders.discount",
953+     "orders.discount",
954+     "source_users.tier",
955+     "user_spend.lifetime_net_spend",
956+     "users.tier",
957+     "user_spend.tier"
958+   ],
959+   "columns_removed": [],
960+   "columns_modified": [],
961+   "has_changes": true
962+ }
963+ ```
964+
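One way to turn this JSON into a CI gate is to fail the build only when columns disappear, since removals are the changes most likely to break downstream consumers. That policy is an assumption made for this sketch, not something clgraph enforces, and `gate` is a hypothetical helper that uses only the keys shown above.

```python
import sys
from typing import List

# Diff report in the shape shown above (abbreviated).
SAMPLE_DIFF = {
    "columns_added": ["users.tier", "orders.discount"],
    "columns_removed": [],
    "columns_modified": [],
    "has_changes": True,
}


def gate(diff_report: dict) -> int:
    """Return a process exit code: 1 if any column was removed, otherwise 0."""
    removed: List[str] = diff_report.get("columns_removed", [])
    for column in removed:
        print(f"removed column: {column}", file=sys.stderr)
    return 1 if removed else 0


print(gate(SAMPLE_DIFF))  # prints 0
```

In a pipeline, the report would come from `clgraph diff OLD_PATH NEW_PATH --format json` (for example via `json.load(sys.stdin)`), and the return value would be passed to `sys.exit`.
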
965+ ### Step 6 — Serve lineage to AI via MCP
966+
967+ ```bash
968+ $ clgraph mcp --pipeline examples/cli_e2e/v1/
969+ ```
970+
971+ This starts an MCP server on stdio. Connect it to Claude Desktop by adding this entry to your config:
972+
973+ ```json
974+ {
975+   "mcpServers": {
976+     "clgraph": {
977+       "command": "clgraph",
978+       "args": ["mcp", "--pipeline", "/path/to/your/sql/"]
979+     }
980+   }
981+ }
982+ ```
983+
984+ Then ask Claude: *"What tables does user_spend depend on?"* or *"Which columns contain PII?"*
985+
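If the config file already lists other MCP servers, the `clgraph` entry has to be merged in rather than pasted over them. A minimal sketch of that merge, operating on an already-loaded config dictionary; `add_clgraph_server` is a hypothetical helper, and the `other` server below is made up purely to show that existing entries survive.

```python
import copy


def add_clgraph_server(config: dict, pipeline_path: str) -> dict:
    """Return a copy of the config with a clgraph entry under mcpServers."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    servers = updated.setdefault("mcpServers", {})
    servers["clgraph"] = {
        "command": "clgraph",
        "args": ["mcp", "--pipeline", pipeline_path],
    }
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
merged = add_clgraph_server(existing, "/path/to/your/sql/")
print(sorted(merged["mcpServers"]))  # prints ['clgraph', 'other']
```
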
986+ ---
987+
791 988 ## Architecture
792 989
793 990 > 📊 **[View the complete architecture diagram](clgraph-simple-diagram.md)** - A visual overview of the 4-stage flow from SQL input to applications.