Update Custom Jobs (OpenLineage) docs #37278

Merged
larakulkarni1 merged 14 commits into master from lara.kulkarni/updating-data-observability-open-lineage-docs
Jun 5, 2026
@larakulkarni1

What does this PR do? What is the motivation?

Updates content/en/data_observability/jobs_monitoring/openlineage/_index.md to reflect the current state of the Custom Jobs (OpenLineage) product.

  • Restructured the page with step-by-step instructions for emitting custom OpenLineage events (START event, optional COMPLETE with datasets, verify in
    Datadog)
  • Added supported facets reference including JobTypeJobFacet with integration values, processingType, and jobType options
  • Added dataset naming conventions with platform-specific namespace and name formats
  • Added explanation of how to link custom job lineage to natively-integrated datasets
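For orientation, here is a minimal sketch of the `START` event shape that the restructured steps describe. It is not copied from the page: the job namespace, job name, and producer URL are hypothetical placeholders, and the `schemaURL` version is an assumption.

```python
import json
import uuid
from datetime import datetime, timezone


def build_start_event(run_id: str, namespace: str, name: str) -> dict:
    """Build a minimal OpenLineage START event as a plain dict."""
    return {
        "eventType": "START",
        # OpenLineage expects an ISO 8601 timestamp.
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Reuse this runId for every later event in the same run.
        "run": {"runId": run_id},
        "job": {"namespace": namespace, "name": name},
        # Hypothetical producer URL identifying the emitting code.
        "producer": "https://example.com/my-custom-job",
        # Assumed spec version; use the one your tooling targets.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


# Generate the run ID once and keep it for the whole run.
run_id = str(uuid.uuid4())
start_event = build_start_event(run_id, "my-namespace", "nightly-export")
print(json.dumps(start_event, indent=2))
```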

Merge instructions

Merge readiness:

  • Ready for merge

For Datadog employees:

Your branch name MUST follow the <name>/<description> convention and include the forward slash (/). Without this format, your pull request will not pass CI, the GitLab pipeline will not run, and you won't get a branch preview. Getting a branch preview makes it easier for us to check any issues with your PR, such as broken links.

If your branch doesn't follow this format, rename it or create a new branch and PR.

[6/5/2025] Merge queue has been disabled on the documentation repo. If you have write access to the repo, the PR has been reviewed by a Documentation team member, and all of the required checks have passed, you can use the Squash and Merge button to merge the PR. If you don't have write access, or you need help, reach out in the #documentation channel in Slack.

AI assistance

Used Claude Code for drafting and editing content, with manual review and corrections against internal docs.

Additional notes

larakulkarni1 and others added 4 commits June 4, 2026 15:55
… reference, and dataset naming conventions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@larakulkarni1 larakulkarni1 marked this pull request as ready for review June 4, 2026 20:11
@larakulkarni1 larakulkarni1 requested a review from a team as a code owner June 4, 2026 20:11
@larakulkarni1 larakulkarni1 changed the title Update Custom Jobs (OpenLineage) docs with step-by-step guide, facets… Update Custom Jobs (OpenLineage) docs Jun 4, 2026
@github-actions

github-actions Bot commented Jun 4, 2026

larakulkarni1 and others added 4 commits June 4, 2026 16:25
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…itional

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@OliviaShoup OliviaShoup left a comment

thanks for the PR! this is a strong restructure and makes the page more readable

left some inline comments. also one bigger thing to consider:

right now there is no worked example with inputs/outputs. the page tells users to "include inputs and outputs in your event" for lineage edges, and the dataset-naming table explains the namespace/name formats, but the code examples had their inputs removed, so nothing actually demonstrates a dataset reference. the PR description mentions an "optional COMPLETE with datasets" example that doesn't appear on the page. maybe you can add a concrete snippet (a COMPLETE event, or an annotated inputs/outputs block) so readers have a model?
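
Something along these lines could serve as the model. This is only a sketch: the dataset namespaces and names follow the general OpenLineage naming spec (`postgres://{host}:{port}` with `{database}.{schema}.{table}`, and `s3://{bucket}` with the object key) and are placeholders, not values taken from the page's table. The job identity, run ID, and producer URL are also hypothetical.

```python
import json
from datetime import datetime, timezone


def dataset(namespace: str, name: str) -> dict:
    """Return a minimal OpenLineage dataset reference."""
    return {"namespace": namespace, "name": name}


def build_complete_event(run_id: str, inputs: list, outputs: list) -> dict:
    """Build a COMPLETE event that carries lineage edges."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},  # must match the START event's runId
        "job": {"namespace": "my-namespace", "name": "nightly-export"},  # hypothetical
        "producer": "https://example.com/my-custom-job",  # hypothetical
        "inputs": inputs,    # datasets the job read
        "outputs": outputs,  # datasets the job wrote
    }


complete_event = build_complete_event(
    run_id="3f6e2c1a-9b7d-4e0a-8c55-2d1f0a6b7c9e",  # placeholder UUID
    inputs=[dataset("postgres://db.example.com:5432", "analytics.public.orders")],
    outputs=[dataset("s3://example-bucket", "exports/orders/2026-06-04.parquet")],
)
print(json.dumps(complete_event, indent=2))
```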


## Step 1: Send a `START` event

Choose a method to send OpenLineage events to Datadog. All examples use the same `runId` UUID throughout the run—generate one and keep it.

Datadog style avoids em dashes that join clauses able to stand alone:

Suggested change
Choose a method to send OpenLineage events to Datadog. All examples use the same `runId` UUID throughout the run—generate one and keep it.
Choose a method to send OpenLineage events to Datadog. All examples use the same `runId` UUID throughout the run. Generate one and keep it.


#### `integration` values

Use `custom` for custom jobs. The values below are used by Datadog's native integrations—using them for custom jobs may produce unexpected behavior. In particular, `SPARK` prevents span generation.

marking for em dash

Suggested change
Use `custom` for custom jobs. The values below are used by Datadog's native integrations—using them for custom jobs may produce unexpected behavior. In particular, `SPARK` prevents span generation.
Use `custom` for custom jobs. The values below are used by Datadog's native integrations. Using them for custom jobs may produce unexpected behavior. In particular, `SPARK` prevents span generation.
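
For context, a sketch of where the `integration` value sits in an event: inside the `jobType` job facet, next to `processingType` and `jobType`. The guard below rejects `SPARK` because that is the only native value named here; the full list on the page is not reproduced. The `BATCH` and `JOB` values are assumptions based on the general OpenLineage spec, and the `_producer` and `_schemaURL` fields that the spec expects on each facet are left out for brevity.

```python
# Only SPARK is named in this thread; extend with the page's other native values.
NATIVE_INTEGRATIONS = {"SPARK"}


def job_type_facet(processing_type: str, job_type: str, integration: str = "custom") -> dict:
    """Return a job-facets dict containing a jobType facet for a custom job."""
    if integration.upper() in NATIVE_INTEGRATIONS:
        raise ValueError(
            f"{integration!r} is reserved for a native integration; use 'custom'"
        )
    return {
        "jobType": {
            "processingType": processing_type,  # for example "BATCH" or "STREAMING"
            "integration": integration,
            "jobType": job_type,  # for example "JOB" (assumed value)
        }
    }


job = {
    "namespace": "my-namespace",  # hypothetical
    "name": "nightly-export",     # hypothetical
    "facets": job_type_facet("BATCH", "JOB"),
}
print(job)
```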

## Prerequisites

- A Datadog API key. See [API and Application Keys][6].
- Your Datadog [site URL][3]. The examples on this page use `datadoghq.com`. Replace the hostname in the examples with the intake endpoint for your site. To find your site, see [Getting started with Datadog sites][3].

[3] is linked twice in this one bullet ("site URL" and "Getting started with Datadog sites" both point to the same page). we can just link once like this:

Suggested change
- Your Datadog [site URL][3]. The examples on this page use `datadoghq.com`. Replace the hostname in the examples with the intake endpoint for your site. To find your site, see [Getting started with Datadog sites][3].
- Your Datadog [site URL][3]. The examples on this page use `datadoghq.com`; replace the hostname with the intake endpoint for your site.


```shell
export DD_API_KEY=your-datadog-api-key
export DD_API_KEY=<YOUR_API_KEY>
```

marking this for consistency (the curl and Python examples use <DD_API_KEY>, but this one uses <YOUR_API_KEY>)

Suggested change
export DD_API_KEY=<YOUR_API_KEY>
export DD_API_KEY=<DD_API_KEY>
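
To show how the exported variable is typically consumed, here is a sketch that reads `DD_API_KEY` from the environment and prepares a POST request without sending it. The intake URL and the `Authorization: Bearer` header are assumptions on my part; whatever the page's curl example uses is authoritative, and the hostname changes with the Datadog site.

```python
import json
import os
import urllib.request

# Assumption: intake endpoint for the datadoghq.com site. Check the page's
# curl example and swap the hostname for your own site.
INTAKE_URL = "https://data-obs-intake.datadoghq.com/api/v1/lineage"


def build_request(event: dict, api_key: str, url: str = INTAKE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying one OpenLineage event."""
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Fall back to the docs placeholder when the variable is not exported.
api_key = os.environ.get("DD_API_KEY", "<DD_API_KEY>")
request = build_request({"eventType": "START"}, api_key)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`; that call is left out so the sketch has no network side effects.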


| Facet | What Datadog does |
|---|---|
| parent | Creates parent-child job hierarchy in the lineage graph |

code-formatting for consistency

Suggested change
| parent | Creates parent-child job hierarchy in the lineage graph |
| `parent` | Creates parent-child job hierarchy in the lineage graph |

Same for the rows below (errorMessage, tags, sql).

| Facet | What Datadog does |
|---|---|
| parent | Creates parent-child job hierarchy in the lineage graph |
| errorMessage | Generates error spans with `error.message` and `error.stack` tags |

Suggested change
| errorMessage | Generates error spans with `error.message` and `error.stack` tags |
| `errorMessage` | Generates error spans with `error.message` and `error.stack` tags |

|---|---|
| parent | Creates parent-child job hierarchy in the lineage graph |
| errorMessage | Generates error spans with `error.message` and `error.stack` tags |
| tags | Adds span tags to the run; `_dd.ol_service` value maps to the Datadog service name |

Suggested change
| tags | Adds span tags to the run; `_dd.ol_service` value maps to the Datadog service name |
| `tags` | Adds span tags to the run; `_dd.ol_service` value maps to the Datadog service name |

| parent | Creates parent-child job hierarchy in the lineage graph |
| errorMessage | Generates error spans with `error.message` and `error.stack` tags |
| tags | Adds span tags to the run; `_dd.ol_service` value maps to the Datadog service name |
| sql | Parses and masks the SQL query; generates query events |

Suggested change
| sql | Parses and masks the SQL query; generates query events |
| `sql` | Parses and masks the SQL query; generates query events |
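
For readers of this thread, a sketch of how the four facets in the table might be attached to a failed run. The field names follow the general OpenLineage facet definitions as I understand them (the exact shape of `tags`, including `source`, is an assumption), all identifiers and values are hypothetical, and `_producer` and `_schemaURL` are left out for brevity.

```python
import traceback


def failure_run_facets(exc: BaseException, service: str, parent_run_id: str,
                       parent_namespace: str, parent_name: str) -> dict:
    """Run facets for a failed run: errorMessage, tags, and parent."""
    return {
        "errorMessage": {
            "message": str(exc),
            "programmingLanguage": "python",
            "stackTrace": "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        },
        # Assumed tags shape; the _dd.ol_service key comes from the table above.
        "tags": {"tags": [{"key": "_dd.ol_service", "value": service, "source": "USER"}]},
        "parent": {
            "run": {"runId": parent_run_id},
            "job": {"namespace": parent_namespace, "name": parent_name},
        },
    }


def sql_job_facet(query: str) -> dict:
    """Job facet carrying the SQL text the job executed."""
    return {"sql": {"query": query}}


try:
    raise ValueError("export failed")
except ValueError as error:
    run_facets = failure_run_facets(
        error,
        service="orders-pipeline",                             # hypothetical
        parent_run_id="3f6e2c1a-9b7d-4e0a-8c55-2d1f0a6b7c9e",  # placeholder UUID
        parent_namespace="my-namespace",
        parent_name="nightly-dag",
    )

job_facets = sql_job_facet("SELECT id, total FROM orders")
print(sorted(run_facets), sorted(job_facets))
```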

larakulkarni1 and others added 5 commits June 4, 2026 17:31
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s/outputs are optional

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@larakulkarni1

Hi @OliviaShoup! I made the changes you requested + added in example inputs/outputs. I also found a few more things I wanted to change/add so made those as well. Please lmk your thoughts!

@OliviaShoup OliviaShoup left a comment

thanks for addressing the feedback so fast! looks great :) approving with one tiny style comment

| Value | Platform |
|---|---|
| `custom` | Custom or unsupported platforms |
| `SPARK` | Apache Spark (native integration only—do not use for custom jobs) |

one remaining em dash (it was fixed in the prose sections but missed in this table cell)

Suggested change
| `SPARK` | Apache Spark (native integration only—do not use for custom jobs) |
| `SPARK` | Apache Spark (native integration only; do not use for custom jobs) |

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@larakulkarni1 larakulkarni1 merged commit c3a9a68 into master Jun 5, 2026
16 checks passed
@larakulkarni1 larakulkarni1 deleted the lara.kulkarni/updating-data-observability-open-lineage-docs branch June 5, 2026 19:06