feat(ingestion): add MarkDeprecated transformer #18000
Open
rospe wants to merge 1 commit into
Adds a recipe-configurable transformer for setting the deprecation aspect on entities flowing through the pipeline. Supports OVERWRITE and PATCH semantics, decommission_time (defaults to now), replacement URN, and URN-based filtering. Supported entity types: dataset, chart, dashboard, dataFlow, dataJob, container.
Linear: ING-2900
Thanks for your contribution! We have created an internal ticket to track this PR. A member of the core DataHub team will be assigned to review it within the next few business days - you will get a follow-up comment once a reviewer is assigned.
Bundle Report: Bundle size has no change ✅
Summary
Add a built-in transformer for setting the deprecation aspect on entities during ingestion. Currently there is no recipe-configurable way to mark assets as deprecated; users must use datahub put one entity at a time or write a custom transformer.
Motivation
When decommissioning data sources, teams need to bulk-deprecate datasets, dashboards, and other assets as part of their ingestion pipelines. This transformer makes that a single config block in any recipe.
Changes
New transformer: mark_deprecated
- Sets deprecated, note, actor, replacement, and decommissionTime on entities
- decommission_time defaults to the current time at pipeline start
- urns filter: if populated, only matching entities are affected; if empty, all entities in the pipeline are marked
- Supports OVERWRITE (default) and PATCH semantics
- PATCH merges with existing server state: it preserves the original note, actor, and decommission date if already set (useful for recurring pipelines where you want to keep the first deprecation timestamp)
Entry points: registered as mark_deprecated in both setup.py and pyproject.toml
Docs: added to the Universal Transformers page with config table and examples
Tests: unit tests covering OVERWRITE, PATCH, URN filtering, all entity types, and default decommission_time behavior
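For reviewers, the urns filter and the OVERWRITE/PATCH behaviour described above can be sketched roughly as follows. This is a standalone illustration and not the code in this PR: the Deprecation dataclass, should_mark, and resolve are hypothetical names, and how PATCH treats replacement is an assumption, since the description only mentions note, actor, and decommission date.

```python
from dataclasses import dataclass, replace
from typing import Optional, Set


@dataclass(frozen=True)
class Deprecation:
    """Hypothetical stand-in for the fields of the deprecation aspect."""
    deprecated: bool
    note: str
    actor: str
    decommission_time: Optional[int] = None  # epoch milliseconds
    replacement: Optional[str] = None


def should_mark(urn: str, urns: Set[str]) -> bool:
    """An empty filter marks every entity; otherwise only listed URNs."""
    return not urns or urn in urns


def resolve(
    desired: Deprecation,
    existing: Optional[Deprecation],
    semantics: str = "OVERWRITE",
) -> Deprecation:
    """Pick the aspect to emit given the configured semantics."""
    if semantics not in ("OVERWRITE", "PATCH"):
        raise ValueError(f"unsupported semantics: {semantics}")
    if semantics == "OVERWRITE" or existing is None:
        return desired
    # PATCH: keep the original note, actor, and decommission time if set.
    # Assumption: replacement is taken from the new configuration.
    return replace(
        desired,
        note=existing.note or desired.note,
        actor=existing.actor or desired.actor,
        decommission_time=(
            existing.decommission_time
            if existing.decommission_time is not None
            else desired.decommission_time
        ),
    )


# A recurring pipeline re-running against an already deprecated entity.
first = Deprecation(True, "Warehouse retired", "urn:li:corpuser:alice", 1_700_000_000_000)
rerun = Deprecation(True, "Nightly re-run", "urn:li:corpuser:ingestion", 1_800_000_000_000)
patched = resolve(rerun, first, "PATCH")
overwritten = resolve(rerun, first, "OVERWRITE")
```

In this sketch, patched keeps the first run's note, actor, and timestamp, while overwritten is simply the new value, which matches the recurring-pipeline use case described for PATCH.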
Example usage
Checklist