Feature Request: Custom Dataplex Entry Type #18041

Description

@zkhazaei

Hi,
What we are looking for is a generic mechanism that allows custom Dataplex entry types to pass through the DataHub ingestion pipeline without being rejected. Currently, in DataHub 1.5.0.x, it appears that only entry types defined in `DATAPLEX_ENTRY_TYPE_MAPPINGS` are accepted, while other custom entry types are filtered out before they can reach later ingestion phases.
Our use case is not tied to a single custom entry type. We have multiple custom Dataplex entry types, for example:

• Contract
• Report
• Dashboard
• Potentially other business-specific asset types in the future

The goal is not to introduce new native DataHub entity models. We don't need DataHub to add support for concepts such as Contract or Report as first-class entities. Instead, we would like custom Dataplex entry types to be preserved and passed through the ingestion pipeline so that they can be handled by custom transformers. For example, something like a generic CustomEntryType that includes a field specifying the actual type (e.g. Contract, Report, Dashboard, etc.).
Also, a field for the custom data model would be useful, potentially containing the raw data payload. This would allow custom entry types to pass through the ingestion pipeline and be transformed later by a custom transformer into DataHub-native entities such as Containers, Data Products, Datasets, Domains, etc.
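To make the idea concrete, here is a minimal sketch of what such an envelope might look like. Everything in it is hypothetical: `CustomEntryType`, its field names, and `wrap_custom_entry` are illustrations only, not existing DataHub classes, and the sketch assumes the Dataplex entry type arrives as a slash-separated resource path.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class CustomEntryType:
    """Hypothetical envelope for a Dataplex entry with no native DataHub mapping."""

    entry_name: str  # name of the Dataplex entry
    custom_type: str  # the actual type, e.g. "Contract", "Report", "Dashboard"
    raw_payload: Dict[str, Any] = field(default_factory=dict)  # untouched source data


def wrap_custom_entry(
    entry_name: str, entry_type: str, payload: Dict[str, Any]
) -> CustomEntryType:
    """Wrap an unmapped entry instead of dropping it (illustrative helper)."""
    # Assumption: the last path segment of the entry type is its short name.
    short_type = entry_type.rsplit("/", 1)[-1]
    return CustomEntryType(
        entry_name=entry_name,
        custom_type=short_type,
        raw_payload=dict(payload),  # copy so later changes to the input do not leak in
    )
```

A custom transformer could then branch on `custom_type` and read whatever it needs from `raw_payload`.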
In our case, the flow would be:

  1. Read a custom Dataplex entry type.
  2. Allow it to pass through the source processing and ingestion pipeline phases.
  3. Process the raw custom payload in a custom transformer.
  4. Convert it into existing DataHub-native entities such as Containers, Data Products, Datasets, Domains, etc.

This would give users the flexibility to implement their own mapping logic without requiring DataHub to support every possible custom Dataplex asset type.
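The flow above could be sketched roughly as follows. This is a self-contained illustration, not DataHub code: `KNOWN_ENTRY_TYPES` is a stand-in for the built-in mapping, and `Entry`, `NativeEntity`, `source_phase`, `pass_through_custom`, `CUSTOM_MAPPING`, and `custom_transformer` are all hypothetical names. The target kinds chosen for each custom type are arbitrary examples.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator, Optional

# Stand-in for the source's built-in mapping table (contents are made up).
KNOWN_ENTRY_TYPES = {"bigquery-table": "dataset", "bigquery-dataset": "container"}

# User-owned mapping from custom entry types to DataHub-native kinds (arbitrary).
CUSTOM_MAPPING = {"Contract": "dataProduct", "Report": "dataset", "Dashboard": "container"}


@dataclass
class Entry:
    name: str
    entry_type: str
    payload: Dict[str, Any]


@dataclass
class NativeEntity:
    kind: str  # "dataset", "container", "dataProduct", "domain", ...
    name: str
    properties: Dict[str, Any]


def source_phase(entries: Iterable[Entry], pass_through_custom: bool = True) -> Iterator[Entry]:
    """Steps 1-2: read entries and keep unknown types instead of filtering them."""
    for entry in entries:
        if entry.entry_type in KNOWN_ENTRY_TYPES or pass_through_custom:
            yield entry


def custom_transformer(entry: Entry) -> Optional[NativeEntity]:
    """Steps 3-4: map the raw payload to an existing native entity kind."""
    kind = KNOWN_ENTRY_TYPES.get(entry.entry_type) or CUSTOM_MAPPING.get(entry.entry_type)
    if kind is None:
        return None  # no mapping defined by the user; skip the entry
    return NativeEntity(kind=kind, name=entry.name, properties=entry.payload)
```

With `pass_through_custom=False` the sketch reproduces the filtering we are seeing today; with the flag enabled, a `Contract` entry reaches `custom_transformer` and the mapping decision stays entirely on the user side.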
Additionally, it would be helpful to understand:

• Whether this is something the team would consider adding to DataHub.
• If so, whether there is any rough timeline or priority estimate. This would help us decide whether we should wait for native support or continue investing in a custom solution.
• What exactly you mean by "contribution". Are you referring to creating and maintaining a community PR that implements support for custom Dataplex entry types, or is there another approach you would recommend?

Thanks again, and we look forward to discussing this further.
