This repository was archived by the owner on Feb 16, 2026. It is now read-only.

Commit b83e328

committed
docs - dagster asset prefix key
1 parent da2fb1b commit b83e328

File tree

6 files changed

+182
-2
lines changed

content/v1.11.x/connectors/pipeline/dagster/index.md

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -83,6 +83,28 @@ For a complete guide on managing secrets in hybrid setups, see the [Hybrid Inges
8383
- Give your API key a name and click on the "Create API Key" button.
8484
- Copy the generated API key to your clipboard and paste it in the field.
8585

86+
### Strip Asset Key Prefix $(id="stripAssetKeyPrefix")
87+
88+
Number of leading segments to remove from asset key paths before resolving to tables.
89+
90+
**About Dagster Asset Keys:**
91+
92+
Dagster asset keys are path-like identifiers represented as arrays of strings (e.g., `["project", "environment", "schema", "table"]`). When OpenMetadata ingests Dagster pipelines, it tries to match these asset keys to table entities using the standard format: `database.schema.table` or `schema.table`.
93+
94+
**When to Use This Setting:**
95+
96+
If your Dagster asset keys include additional prefix segments beyond the database/schema/table hierarchy, use this setting to strip those prefixes. For example:
97+
- Asset key: `["project", "environment", "schema", "table"]`
98+
- Set value to `2` to strip `project` and `environment`
99+
- Result: `schema.table` (matches OpenMetadata table entities)
100+
101+
Common use cases include stripping:
102+
- Project/workspace identifiers
103+
- Environment names (dev/staging/prod)
104+
- Storage bucket/container prefixes
105+
106+
Default value is `0` (no stripping).
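
The stripping step described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the connector's actual code: the function name `strip_asset_key_prefix` is hypothetical, and rejecting negative values is an assumption.

```python
from typing import List


def strip_asset_key_prefix(asset_key: List[str], strip_count: int = 0) -> List[str]:
    """Drop the first `strip_count` segments of an asset key path.

    Hypothetical helper for illustration; the default of 0 mirrors the
    documented default (no stripping).
    """
    if strip_count < 0:
        # Assumption: negative values are not meaningful for this setting.
        raise ValueError("strip_count must be non-negative")
    return asset_key[strip_count:]


key = ["project", "environment", "schema", "table"]
stripped = strip_asset_key_prefix(key, 2)
print(".".join(stripped))  # schema.table
```

With the default of `0` the key is returned unchanged, so existing setups that already use `database.schema.table` keys are unaffected.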
107+
86108
{% /extraContent %}
87109

88110
{% partial file="/v1.11/connectors/test-connection.md" /%}
@@ -142,6 +164,53 @@ def customer_orders():
142164
| `["schema", "table"]` | Schema and table only |
143165
| `["table"]` | Table name only |
144166

167+
**Using stripAssetKeyPrefix for Asset Keys with Prefixes**
168+
169+
If your asset keys include additional prefix segments (e.g., project name, environment), use the `stripAssetKeyPrefix` configuration to remove them before matching to tables:
170+
171+
**Example 1: Stripping Environment Prefix**
172+
173+
```python
# Your Dagster asset keys include environment prefix
from dagster import asset

@asset(key=["prod", "analytics_db", "public", "customers"])
def customers():
    ...

@asset(
    key=["prod", "analytics_db", "public", "orders"],
    deps=[customers]
)
def orders():
    ...
```
186+
187+
**Configuration:**
188+
```yaml
sourceConfig:
  config:
    stripAssetKeyPrefix: 1  # Remove the first segment ("prod")
```
193+
194+
**Result:** Asset keys become `["analytics_db", "public", "customers"]` and `["analytics_db", "public", "orders"]`, which match the table format `database.schema.table`.
195+
196+
**Example 2: Stripping Multiple Prefixes**
197+
198+
```python
# Asset keys with project and environment prefixes
from dagster import asset

@asset(key=["my_project", "staging", "warehouse", "raw", "users"])
def users():
    ...
```
204+
205+
**Configuration:**
206+
```yaml
sourceConfig:
  config:
    stripAssetKeyPrefix: 2  # Remove first two segments ("my_project", "staging")
```
211+
212+
**Result:** Asset key becomes `["warehouse", "raw", "users"]`, matching `warehouse.raw.users` in OpenMetadata.
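
To pick a value for `stripAssetKeyPrefix`, it can help to compare one asset key with the table name you expect it to match. The sketch below is a hypothetical helper (`required_strip_count` is not part of OpenMetadata or Dagster) and assumes table names are dot-separated with no dots inside individual segments.

```python
from typing import List, Optional


def required_strip_count(asset_key: List[str], table_name: str) -> Optional[int]:
    """Return how many leading segments must be stripped so that the
    remaining asset key equals the dot-separated table name.

    Returns None when the table name is not a suffix of the asset key.
    """
    table_parts = table_name.split(".")
    extra = len(asset_key) - len(table_parts)
    if extra < 0 or asset_key[extra:] != table_parts:
        return None
    return extra


# Example 2 above: two prefix segments precede warehouse.raw.users
print(required_strip_count(
    ["my_project", "staging", "warehouse", "raw", "users"],
    "warehouse.raw.users",
))  # 2
```

If different assets would need different values, a single setting cannot cover them all, so it is worth checking a few representative keys before choosing one.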
213+
145214
**2. Assets Include Table Metadata in Materializations**
146215

147216
If your assets don't use database-style keys, you can still get lineage by including table metadata when materializing:

content/v1.11.x/connectors/pipeline/dagster/yaml.md

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -75,9 +75,26 @@ This is a sample config for Dagster:
7575
**timeout**: Connection time limit between OM and the Dagster GraphQL API, in seconds.
7676
{% /codeInfo %}
7777

78+
{% codeInfo srNumber=4 %}
79+
80+
**stripAssetKeyPrefix**: Number of leading segments to remove from asset key paths before resolving to tables.
81+
82+
Dagster asset keys are path-like identifiers represented as arrays of strings (e.g., `["project", "environment", "schema", "table"]`). When OpenMetadata ingests Dagster pipelines, it tries to match these asset keys to table entities using the standard format: `database.schema.table` or `schema.table`.
83+
84+
If your Dagster asset keys include additional prefix segments beyond the database/schema/table hierarchy, use this setting to strip those prefixes. For example:
85+
- Asset key: `["project", "environment", "schema", "table"]`
86+
- Set value to `2` to strip `project` and `environment`
87+
- Result: `schema.table` (matches OpenMetadata table entities)
88+
89+
Common use cases include stripping project/workspace identifiers, environment names (dev/staging/prod), or storage bucket/container prefixes.
90+
91+
Default value is `0` (no stripping).
92+
93+
{% /codeInfo %}
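
As a rough illustration of how the remaining segments line up with `database.schema.table` or `schema.table`, the following sketch labels each segment from the right after stripping. The function `interpret_asset_key` is hypothetical, and treating keys with zero or more than three remaining segments as unmatched is an assumption, not documented connector behavior.

```python
from typing import Dict, List, Optional

# Hierarchy levels, most general first.
LEVELS = ["database", "schema", "table"]


def interpret_asset_key(
    asset_key: List[str], strip_count: int = 0
) -> Optional[Dict[str, str]]:
    """Strip leading segments, then label what is left as
    database/schema/table, aligning from the right so the last
    segment is always the table."""
    remaining = asset_key[strip_count:]
    if not 1 <= len(remaining) <= len(LEVELS):
        # Assumption: anything else cannot be mapped to a table.
        return None
    names = LEVELS[-len(remaining):]
    return dict(zip(names, remaining))


key = ["project", "environment", "schema", "table"]
print(interpret_asset_key(key, 0))  # None: four segments do not fit the hierarchy
print(interpret_asset_key(key, 2))  # {'schema': 'schema', 'table': 'table'}
```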
94+
7895
#### Source Configuration - Lineage
7996

80-
{% codeInfo srNumber=4 %}
97+
{% codeInfo srNumber=5 %}
8198

8299
**lineageInformation**: Configure lineage extraction settings.
83100

@@ -120,6 +137,9 @@ source:
120137
# timeout: 1000
121138
```
122139
```yaml {% srNumber=4 %}
140+
# stripAssetKeyPrefix: 0
141+
```
142+
```yaml {% srNumber=5 %}
123143
sourceConfig:
124144
config:
125145
type: PipelineMetadata

content/v1.11.x/main-concepts/metadata-standard/schemas/entity/services/connections/pipeline/dagsterConnection.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ slug: /main-concepts/metadata-standard/schemas/entity/services/connections/pipel
1313
- **`host`** *(string)*: URL to the Dagster instance.
1414
- **`token`** *(string)*: To Connect to Dagster Cloud.
1515
- **`timeout`** *(integer)*: Connection time limit between OM and the Dagster GraphQL API, in seconds. Default: `1000`.
16+
- **`stripAssetKeyPrefix`** *(integer)*: Number of leading segments to remove from asset key paths before resolving to tables. Dagster asset keys are path-like identifiers (e.g., `["project", "environment", "schema", "table"]`). Use this setting to strip prefix segments beyond the database/schema/table hierarchy. Default: `0`.
1617
- **`pipelineFilterPattern`**: Regex exclude pipelines. Refer to *../../../../type/filterPattern.json#/definitions/filterPattern*.
1718
- **`supportsMetadataExtraction`**: Refer to *../connectionBasicType.json#/definitions/supportsMetadataExtraction*.
1819
## Definitions

content/v1.12.x-SNAPSHOT/connectors/pipeline/dagster/index.md

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -83,6 +83,28 @@ For a complete guide on managing secrets in hybrid setups, see the [Hybrid Inges
8383
- Give your API key a name and click on the "Create API Key" button.
8484
- Copy the generated API key to your clipboard and paste it in the field.
8585

86+
### Strip Asset Key Prefix $(id="stripAssetKeyPrefix")
87+
88+
Number of leading segments to remove from asset key paths before resolving to tables.
89+
90+
**About Dagster Asset Keys:**
91+
92+
Dagster asset keys are path-like identifiers represented as arrays of strings (e.g., `["project", "environment", "schema", "table"]`). When OpenMetadata ingests Dagster pipelines, it tries to match these asset keys to table entities using the standard format: `database.schema.table` or `schema.table`.
93+
94+
**When to Use This Setting:**
95+
96+
If your Dagster asset keys include additional prefix segments beyond the database/schema/table hierarchy, use this setting to strip those prefixes. For example:
97+
- Asset key: `["project", "environment", "schema", "table"]`
98+
- Set value to `2` to strip `project` and `environment`
99+
- Result: `schema.table` (matches OpenMetadata table entities)
100+
101+
Common use cases include stripping:
102+
- Project/workspace identifiers
103+
- Environment names (dev/staging/prod)
104+
- Storage bucket/container prefixes
105+
106+
Default value is `0` (no stripping).
107+
86108
{% /extraContent %}
87109

88110
{% partial file="/v1.12/connectors/test-connection.md" /%}
@@ -142,6 +164,53 @@ def customer_orders():
142164
| `["schema", "table"]` | Schema and table only |
143165
| `["table"]` | Table name only |
144166

167+
**Using stripAssetKeyPrefix for Asset Keys with Prefixes**
168+
169+
If your asset keys include additional prefix segments (e.g., project name, environment), use the `stripAssetKeyPrefix` configuration to remove them before matching to tables:
170+
171+
**Example 1: Stripping Environment Prefix**
172+
173+
```python
# Your Dagster asset keys include environment prefix
from dagster import asset

@asset(key=["prod", "analytics_db", "public", "customers"])
def customers():
    ...

@asset(
    key=["prod", "analytics_db", "public", "orders"],
    deps=[customers]
)
def orders():
    ...
```
186+
187+
**Configuration:**
188+
```yaml
sourceConfig:
  config:
    stripAssetKeyPrefix: 1  # Remove the first segment ("prod")
```
193+
194+
**Result:** Asset keys become `["analytics_db", "public", "customers"]` and `["analytics_db", "public", "orders"]`, which match the table format `database.schema.table`.
195+
196+
**Example 2: Stripping Multiple Prefixes**
197+
198+
```python
# Asset keys with project and environment prefixes
from dagster import asset

@asset(key=["my_project", "staging", "warehouse", "raw", "users"])
def users():
    ...
```
204+
205+
**Configuration:**
206+
```yaml
sourceConfig:
  config:
    stripAssetKeyPrefix: 2  # Remove first two segments ("my_project", "staging")
```
211+
212+
**Result:** Asset key becomes `["warehouse", "raw", "users"]`, matching `warehouse.raw.users` in OpenMetadata.
213+
145214
**2. Assets Include Table Metadata in Materializations**
146215

147216
If your assets don't use database-style keys, you can still get lineage by including table metadata when materializing:

content/v1.12.x-SNAPSHOT/connectors/pipeline/dagster/yaml.md

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -75,9 +75,26 @@ This is a sample config for Dagster:
7575
**timeout**: Connection time limit between OM and the Dagster GraphQL API, in seconds.
7676
{% /codeInfo %}
7777

78+
{% codeInfo srNumber=4 %}
79+
80+
**stripAssetKeyPrefix**: Number of leading segments to remove from asset key paths before resolving to tables.
81+
82+
Dagster asset keys are path-like identifiers represented as arrays of strings (e.g., `["project", "environment", "schema", "table"]`). When OpenMetadata ingests Dagster pipelines, it tries to match these asset keys to table entities using the standard format: `database.schema.table` or `schema.table`.
83+
84+
If your Dagster asset keys include additional prefix segments beyond the database/schema/table hierarchy, use this setting to strip those prefixes. For example:
85+
- Asset key: `["project", "environment", "schema", "table"]`
86+
- Set value to `2` to strip `project` and `environment`
87+
- Result: `schema.table` (matches OpenMetadata table entities)
88+
89+
Common use cases include stripping project/workspace identifiers, environment names (dev/staging/prod), or storage bucket/container prefixes.
90+
91+
Default value is `0` (no stripping).
92+
93+
{% /codeInfo %}
94+
7895
#### Source Configuration - Lineage
7996

80-
{% codeInfo srNumber=4 %}
97+
{% codeInfo srNumber=5 %}
8198

8299
**lineageInformation**: Configure lineage extraction settings.
83100

@@ -120,6 +137,9 @@ source:
120137
# timeout: 1000
121138
```
122139
```yaml {% srNumber=4 %}
140+
# stripAssetKeyPrefix: 0
141+
```
142+
```yaml {% srNumber=5 %}
123143
sourceConfig:
124144
config:
125145
type: PipelineMetadata

content/v1.12.x-SNAPSHOT/main-concepts/metadata-standard/schemas/entity/services/connections/pipeline/dagsterConnection.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ slug: /main-concepts/metadata-standard/schemas/entity/services/connections/pipel
1313
- **`host`** *(string)*: URL to the Dagster instance.
1414
- **`token`** *(string)*: To Connect to Dagster Cloud.
1515
- **`timeout`** *(integer)*: Connection time limit between OM and the Dagster GraphQL API, in seconds. Default: `1000`.
16+
- **`stripAssetKeyPrefix`** *(integer)*: Number of leading segments to remove from asset key paths before resolving to tables. Dagster asset keys are path-like identifiers (e.g., `["project", "environment", "schema", "table"]`). Use this setting to strip prefix segments beyond the database/schema/table hierarchy. Default: `0`.
1617
- **`pipelineFilterPattern`**: Regex exclude pipelines. Refer to *../../../../type/filterPattern.json#/definitions/filterPattern*.
1718
- **`supportsMetadataExtraction`**: Refer to *../connectionBasicType.json#/definitions/supportsMetadataExtraction*.
1819
## Definitions
