| 1 | +--- |
| 2 | +name: manage-external-lineage |
| 3 | +title: Manage External Lineage |
| 4 | +summary: Create and delete OpenLineage events to connect external systems to Snowflake's lineage graph. |
| 5 | +description: "Use when you need to connect external data sources (Postgres, MySQL, S3, Kafka, etc.) to Snowflake's lineage graph via the OpenLineage REST API, or when you need to delete external lineage relationships. Triggers: external lineage, openlineage event, send lineage, establish lineage, delete lineage, create lineage event, connect postgres to snowflake lineage, connect mysql to snowflake lineage, connect s3 to snowflake lineage, track data flow, document data pipeline, lineage api, ingest lineage." |
| 6 | +tools: |
| 7 | + - snowflake_sql_execute |
| 8 | + - snowflake_object_search |
| 9 | + - Bash |
| 10 | + - Read |
| 11 | + - Write |
| 12 | + - Edit |
| 13 | +prompt: Create an external lineage event linking my Postgres source table to a Snowflake table. |
| 14 | +language: en |
| 15 | +status: Published |
| 16 | +author: Snowflake Solutions Team |
| 17 | +type: snowflake |
| 18 | +--- |
| 19 | + |
| 20 | +# Manage External Lineage |
| 21 | + |
| 22 | +## Overview |
| 23 | + |
| 24 | +This skill creates and deletes OpenLineage `COMPLETE` events through Snowflake's external lineage REST API so external systems (Postgres, MySQL, S3, Kafka, DB2, Trino, etc.) appear in Snowsight's lineage graph alongside Snowflake objects. Use it to document cross-platform pipelines, show upstream sources feeding Snowflake tables, or show downstream destinations Snowflake feeds. |
| 25 | + |
| 26 | +## Prerequisites |
| 27 | + |
| 28 | +- `INGEST LINEAGE` privilege on the account (and `DELETE LINEAGE` for deletes). |
| 29 | +- An active Snowflake connection in your `cortex` session, or a Programmatic Access Token (PAT) or JWT. |
| 30 | +- Python dependencies: `requests`, `snowflake-connector-python`. |
| 31 | + |
| 32 | +## Workflow |
| 33 | + |
| 34 | +### 1. Verify privileges |
| 35 | + |
| 36 | +```sql |
| 37 | +SHOW GRANTS ON ACCOUNT; |
| 38 | +-- GRANT INGEST LINEAGE ON ACCOUNT TO ROLE <role_name>; |
| 39 | +``` |
| 40 | + |
| 41 | +### 2. Verify the Snowflake target exists |
| 42 | + |
| 43 | +```sql |
| 44 | +DESCRIBE TABLE <database>.<schema>.<table_name>; |
| 45 | +``` |
| 46 | + |
| 47 | +### 3. Build the payload |
| 48 | + |
| 49 | +```json |
| 50 | +{ |
| 51 | + "eventType": "COMPLETE", |
| 52 | + "eventTime": "<ISO8601>", |
| 53 | + "job": {"namespace": "<job_namespace>", "name": "<job_name>"}, |
| 54 | + "run": {"runId": "<UUID>"}, |
| 55 | + "producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client", |
| 56 | + "schemaURL": "https://openlineage.io/spec/0-0-1/OpenLineage.json", |
| 57 | + "inputs": [{"namespace": "<source_ns>", "name": "<source_object>"}], |
| 58 | + "outputs": [{"namespace": "snowflake://<ORG>-<ACCOUNT>", "name": "<DB>.<SCHEMA>.<TABLE>"}] |
| 59 | +} |
| 60 | +``` |
| 61 | + |
| 62 | +Stop and show the payload to the user before sending. |
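The payload template in step 3 could be filled in programmatically rather than by hand. The following is a minimal sketch, not part of the skill's shipped scripts: `build_lineage_event` is a hypothetical helper, and the account and table names in the usage lines are placeholder values. It generates the `runId` and `eventTime` fields and reuses the `producer` and `schemaURL` values shown in the template.

```python
import json
import uuid
from datetime import datetime, timezone


def build_lineage_event(job_namespace, job_name, inputs, outputs):
    """Return an OpenLineage COMPLETE event as a dict (hypothetical helper).

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    """
    # ISO 8601 UTC timestamp with millisecond precision and a trailing "Z".
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "eventType": "COMPLETE",
        "eventTime": now.replace("+00:00", "Z"),
        "job": {"namespace": job_namespace, "name": job_name},
        "run": {"runId": str(uuid.uuid4())},
        "producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client",
        "schemaURL": "https://openlineage.io/spec/0-0-1/OpenLineage.json",
        # No facets are attached to any dataset, matching the template.
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own organization, account, and table.
    event = build_lineage_event(
        "external-etl",
        "customer_data_pipeline",
        inputs=[("postgres://prod-db.example.com:5432", "public.customer_signups")],
        outputs=[("snowflake://MYORG-MYACCOUNT", "ANALYTICS.PUBLIC.CUSTOMERS")],
    )
    print(json.dumps(event, indent=2))
```

Printing the result keeps the review step intact: the JSON can be shown to the user and written to `payload.json` only after it has been approved.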
| 63 | + |
| 64 | +### 4. Send the event |
| 65 | + |
| 66 | +Recommended (uses your active `cortex` connection, no token wrangling): |
| 67 | + |
| 68 | +```bash |
| 69 | +SNOWFLAKE_CONNECTION_NAME=<connection> \ |
| 70 | + python <SKILL_DIR>/send_lineage_via_connection.py -p payload.json |
| 71 | +``` |
| 72 | + |
| 73 | +PAT/JWT alternative: |
| 74 | + |
| 75 | +```bash |
| 76 | +<SKILL_DIR>/send_lineage.sh -a <ACCOUNT> -t /path/to/token.txt -p payload.json |
| 77 | +``` |
| 78 | + |
| 79 | +### 5. Verify in Snowsight |
| 80 | + |
| 81 | +Catalog → Database Explorer → your table → **Lineage** tab. Changes may take 1–2 minutes to appear. |
| 82 | + |
| 83 | +## Deleting external lineage |
| 84 | + |
| 85 | +| Scenario | Params | Effect | |
| 86 | +|---|---|---| |
| 87 | +| Break specific edge | source + target | Removes that edge only | |
| 88 | +| Break all downstream | source only | Removes source → all targets | |
| 89 | +| Remove from graph | target only | Removes target regardless of source | |
| 90 | + |
| 91 | +```bash |
| 92 | +curl --globoff -X DELETE \ |
| 93 | + -H "Authorization: Bearer $API_KEY" \ |
| 94 | + "https://<ACCOUNT>.snowflakecomputing.com/api/v2/lineage/external-lineage?sourceNamespace=<SRC_NS>&sourceName=<SRC>&sourceDatasetType=External%20Node&targetName=<DB>.<SCHEMA>.<TABLE>&targetDatasetType=TABLE" |
| 95 | +``` |
| 96 | + |
| 97 | +DELETE returns HTTP 200 whether or not anything matched, so confirm the result in Snowsight. |
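The three delete scenarios differ only in which query parameters are present, so the URL can be assembled from one function. This is a sketch under stated assumptions: `build_delete_url` is a hypothetical helper, the endpoint path and parameter names are copied from the curl example, and it assumes the dataset-type parameter travels with whichever side (source or target) is supplied. Values are percent-encoded with `urllib.parse.quote`, so `External Node` becomes `External%20Node`.

```python
from urllib.parse import quote, urlencode


def build_delete_url(account, source_namespace=None, source_name=None,
                     target_name=None, target_dataset_type="TABLE"):
    """Build the DELETE URL for one of the three scenarios (hypothetical helper)."""
    params = {}
    if source_name is not None:
        if source_namespace is None:
            raise ValueError("source_name requires source_namespace")
        params["sourceNamespace"] = source_namespace
        params["sourceName"] = source_name
        # Assumption: the external side is always typed "External Node".
        params["sourceDatasetType"] = "External Node"
    if target_name is not None:
        params["targetName"] = target_name
        params["targetDatasetType"] = target_dataset_type
    if not params:
        raise ValueError("provide a source, a target, or both")
    # quote_via=quote encodes spaces as %20 instead of "+".
    query = urlencode(params, quote_via=quote)
    return (
        f"https://{account}.snowflakecomputing.com"
        f"/api/v2/lineage/external-lineage?{query}"
    )


if __name__ == "__main__":
    # Placeholder account and object names.
    print(build_delete_url(
        "MYORG-MYACCOUNT",
        source_namespace="postgres://prod-db.example.com:5432",
        source_name="public.customer_signups",
        target_name="ANALYTICS.PUBLIC.CUSTOMERS",
    ))
```

Passing only the source arguments yields the "break all downstream" URL, and passing only `target_name` yields the "remove from graph" URL.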
| 98 | + |
| 99 | +## Example: Postgres + MySQL → Snowflake |
| 100 | + |
| 101 | +```json |
| 102 | +{ |
| 103 | + "eventType": "COMPLETE", |
| 104 | + "eventTime": "2026-02-20T19:00:00.000Z", |
| 105 | + "job": {"namespace": "external-etl", "name": "customer_data_pipeline"}, |
| 106 | + "run": {"runId": "f47ac10b-58cc-4372-a567-0e02b2c3d479"}, |
| 107 | + "producer": "https://github.com/OpenLineage/OpenLineage/blob/v1-0-0/client", |
| 108 | + "schemaURL": "https://openlineage.io/spec/0-0-1/OpenLineage.json", |
| 109 | + "inputs": [ |
| 110 | + {"namespace": "postgres://prod-db.example.com:5432", "name": "public.customer_signups"}, |
| 111 | + {"namespace": "mysql://warehouse.example.com:3306", "name": "raw.customer_raw"} |
| 112 | + ], |
| 113 | + "outputs": [ |
| 114 | + {"namespace": "snowflake://<ORG>-<ACCOUNT>", "name": "<DB>.<SCHEMA>.<TABLE>"} |
| 115 | + ] |
| 116 | +} |
| 117 | +``` |
| 118 | + |
| 119 | +## Common Mistakes |
| 120 | + |
| 121 | +- **Wrong `eventType`.** Only `COMPLETE` is processed; `START`, `RUNNING`, and `FAIL` are silently ignored. |
| 122 | +- **Including `facets` on external objects.** Omit them — externals render as "External Node" automatically. |
| 123 | +- **Underscores in the account identifier.** Use `ORG-ACCOUNT`, not `ORG_ACCOUNT`. |
| 124 | +- **Not using `--globoff` with curl.** Without it, curl re-encodes `External%20Node` and the DELETE matches nothing. |
| 125 | +- **Trusting HTTP 200 from DELETE.** It always returns 200; verify in Snowsight. |
| 126 | +- **Case-mismatched namespaces.** Namespace and name values are case-sensitive; a typo creates a new orphan node. |
| 127 | +- **Mismatched delete direction.** If lineage was created with the external object as `inputs`, the delete `source` must be that same external node. |
| 128 | +- **Expecting external nodes in `GET_LINEAGE`.** They only appear in the Snowsight UI. |
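Several of the mistakes above can be caught before the event is sent. The check below is a rough sketch, not part of the skill: `check_payload` is a hypothetical helper that looks only at the event type, facets on non-Snowflake datasets, and underscores in the `snowflake://` namespace. Case mismatches and delete direction still need a manual review.

```python
def check_payload(event):
    """Return a list of warnings for a lineage event dict (hypothetical helper)."""
    problems = []
    if event.get("eventType") != "COMPLETE":
        problems.append("eventType must be COMPLETE; other types are ignored")
    prefix = "snowflake://"
    for side in ("inputs", "outputs"):
        for dataset in event.get(side, []):
            namespace = dataset.get("namespace", "")
            if namespace.startswith(prefix):
                # The account identifier should read ORG-ACCOUNT, not ORG_ACCOUNT.
                if "_" in namespace[len(prefix):]:
                    problems.append(f"underscore in account identifier: {namespace}")
            elif "facets" in dataset:
                problems.append(f"facets present on external object: {dataset.get('name')}")
    return problems


if __name__ == "__main__":
    # Deliberately flawed sample event with placeholder names.
    sample = {
        "eventType": "START",
        "inputs": [{"namespace": "postgres://prod-db.example.com:5432",
                    "name": "public.customer_signups", "facets": {}}],
        "outputs": [{"namespace": "snowflake://MYORG_MYACCOUNT",
                     "name": "ANALYTICS.PUBLIC.CUSTOMERS"}],
    }
    for problem in check_payload(sample):
        print("WARNING:", problem)
```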
| 129 | + |
| 130 | +## Limits |
| 131 | + |
| 132 | +- 1-year retention |
| 133 | +- 10,000 events per account |
| 134 | +- 1,000-character maximum for fully qualified names (FQN) |
| 135 | +- No column-level lineage |
| 136 | + |
| 137 | +## Reference files |
| 138 | + |
| 139 | +- `namespace_conventions.md` — namespace formats per source type |
| 140 | +- `token_setup.md` — PAT setup |
| 141 | +- `troubleshooting.md` — 401/403/404 fixes |
| 142 | +- `send_lineage_via_connection.py` — recommended sender |
| 143 | +- `send_lineage.sh`, `generate_payload.sh` — PAT-based alternatives |