Skip to content

[FLINK-39813][Kinesis/Connector] Add lineage support to Kinesis Streams connector#250

Open
fmorillo7694 wants to merge 1 commit into
apache:mainfrom
fmorillo7694:feature/kinesis-lineage-support
Open

[FLINK-39813][Kinesis/Connector] Add lineage support to Kinesis Streams connector#250
fmorillo7694 wants to merge 1 commit into
apache:mainfrom
fmorillo7694:feature/kinesis-lineage-support

Conversation

@fmorillo7694

Copy link
Copy Markdown

Implements LineageVertexProvider interface on KinesisStreamsSource and KinesisStreamsSink to enable automatic lineage extraction in Flink 2.x.

The LineageGraph API in Flink 2.0+ allows the table planner and OpenLineage integration to automatically discover input/output datasets with their metadata. This change enables lineage for both DataStream and SQL/Table API usage of the Kinesis connector.

New classes:

  • KinesisLineageUtil: namespace/name extraction from stream ARN
  • KinesisDatasetFacet: stream ARN, name, region metadata
  • TypeDatasetFacet: schema information via TypeInformation

Namespace format uses full ARN prefix for governance specificity:
arn:{partition}:kinesis:{region}:{account}:stream

Dataset name is the stream name extracted from the ARN.

Covers DataStream API (direct), SQL/Table API (via KinesisDynamicSource which internally creates KinesisStreamsSource), and the sink path.

Purpose of the change

For example: Implements the Table API for the Kinesis Source.

Verifying this change

Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment
  • Added unit tests
  • Manually verified by running the Kinesis connector on a local Flink cluster.

Significant changes

(Please check any boxes [x] if the answer is "yes". You can first publish the PR and check them afterwards, for convenience.)

  • Dependencies have been added or upgraded
  • Public API has been changed (Public API is any class annotated with @Public(Evolving))
  • Serializers have been changed
  • New feature has been introduced
    • If yes, how is this documented? (not applicable / docs / JavaDocs / not documented)

Implements LineageVertexProvider interface on KinesisStreamsSource and
KinesisStreamsSink to enable automatic lineage extraction in Flink 2.x.

The LineageGraph API in Flink 2.0+ allows the table planner and
OpenLineage integration to automatically discover input/output datasets
with their metadata. This change enables lineage for both DataStream
and SQL/Table API usage of the Kinesis connector.

New classes:
- KinesisLineageUtil: namespace/name extraction from stream ARN
- KinesisDatasetFacet: stream ARN, name, region metadata
- TypeDatasetFacet: schema information via TypeInformation

Namespace format uses full ARN prefix for governance specificity:
  arn:{partition}:kinesis:{region}:{account}:stream

Dataset name is the stream name extracted from the ARN.

Covers DataStream API (direct), SQL/Table API (via KinesisDynamicSource
which internally creates KinesisStreamsSource), and the sink path.
@fmorillo7694 fmorillo7694 changed the title [FLINK-39810][Kinesis/Connector] Add lineage support to Kinesis Streams connector [FLINK-39813][Kinesis/Connector] Add lineage support to Kinesis Streams connector Jun 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants