[FLINK-39813][Kinesis/Connector] Add lineage support to Kinesis Streams connector #250
Open
fmorillo7694 wants to merge 1 commit into
Implements LineageVertexProvider interface on KinesisStreamsSource and
KinesisStreamsSink to enable automatic lineage extraction in Flink 2.x.
The LineageGraph API in Flink 2.0+ allows the table planner and
OpenLineage integration to automatically discover input/output datasets
with their metadata. This change enables lineage for both DataStream
and SQL/Table API usage of the Kinesis connector.
New classes:
- KinesisLineageUtil: namespace/name extraction from stream ARN
- KinesisDatasetFacet: stream ARN, name, region metadata
- TypeDatasetFacet: schema information via TypeInformation
Namespace format uses full ARN prefix for governance specificity:
arn:{partition}:kinesis:{region}:{account}:stream
Dataset name is the stream name extracted from the ARN.
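As a rough illustration of the namespace/name split described above, here is a minimal, self-contained sketch. It is not the PR's KinesisLineageUtil; the class name KinesisArnSketch and its method names are hypothetical, and it assumes a plain stream ARN of the form arn:{partition}:kinesis:{region}:{account}:stream/{streamName}.

```java
// Hypothetical helper, not the connector's KinesisLineageUtil.
// Assumes a stream ARN shaped like:
//   arn:{partition}:kinesis:{region}:{account}:stream/{streamName}
final class KinesisArnSketch {

    // Separator between the ARN prefix and the stream name.
    private static final String STREAM_SEPARATOR = ":stream/";

    private KinesisArnSketch() {}

    /** Returns the ARN prefix up to and including ":stream". */
    static String namespace(String streamArn) {
        int idx = separatorIndex(streamArn);
        // Keep ":stream" but drop the trailing "/".
        return streamArn.substring(0, idx + STREAM_SEPARATOR.length() - 1);
    }

    /** Returns everything after ":stream/", i.e. the stream name. */
    static String name(String streamArn) {
        int idx = separatorIndex(streamArn);
        return streamArn.substring(idx + STREAM_SEPARATOR.length());
    }

    // Validates the ARN shape and locates the ":stream/" separator.
    private static int separatorIndex(String streamArn) {
        if (streamArn == null || !streamArn.startsWith("arn:")) {
            throw new IllegalArgumentException("Not an ARN: " + streamArn);
        }
        int idx = streamArn.indexOf(STREAM_SEPARATOR);
        if (idx < 0 || idx + STREAM_SEPARATOR.length() == streamArn.length()) {
            throw new IllegalArgumentException("Not a Kinesis stream ARN: " + streamArn);
        }
        return idx;
    }
}
```

For an ARN such as arn:aws:kinesis:us-east-1:123456789012:stream/orders, this yields the namespace arn:aws:kinesis:us-east-1:123456789012:stream and the dataset name orders. Consumer ARNs (stream/{name}/consumer/...) are not handled by this sketch.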
Covers DataStream API (direct), SQL/Table API (via KinesisDynamicSource
which internally creates KinesisStreamsSource), and the sink path.