OpenLineage : Add core lineage service scaffolding#4705

Open
iting0321 wants to merge 8 commits into
apache:mainfrom
iting0321:ol-core-spi-domain-model-1


iting0321 commented Jun 11, 2026

Contributor

Description

This PR introduces the core query/response models, a LineageService boundary, runtime service/config wiring, and disabled-by-default feature guards. It does not add persistence, ingest parsing, REST endpoints, RBAC, or forwarding.

Changes

  • Add polaris-core lineage query/response models for query requests, graph responses, nodes, node data, field mappings, direction, granularity, and node type in polaris-core/.../lineage/.
  • Add LineageService query boundary in polaris-core/.../lineage/.
  • Add runtime DefaultLineageService and LineageConfiguration in runtime/service/.../lineage/.
  • Add ENABLE_LINEAGE feature flag, defaulting to disabled, in polaris-core/.../config/FeatureConfiguration.java.
  • Add generated config docs for polaris.lineage.enabled in site/content/in-dev/unreleased/configuration/config-sections/.
  • Add unit tests for disabled guard behavior and enabled-but-not-implemented placeholder behavior in runtime/service/.../DefaultLineageServiceTest.java.
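
As a rough illustration of what a query boundary like this can look like, here is a minimal sketch. Every name in it (`SketchDirection`, `SketchGranularity`, `SketchQueryRequest`, `SketchGraph`, `SketchLineageService`), the enum constants, and the `depth` parameter are assumptions made for the example; the actual types in polaris-core/.../lineage/ are not shown on this page and may differ.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical constants: this page names the enums but does not list their values.
enum SketchDirection { UPSTREAM, DOWNSTREAM, BOTH }

enum SketchGranularity { DATASET, COLUMN }

// Hypothetical request shape: a start node, a traversal direction,
// a granularity, and a depth limit.
record SketchQueryRequest(
    String nodeId, SketchDirection direction, SketchGranularity granularity, int depth) {
  SketchQueryRequest {
    Objects.requireNonNull(nodeId, "nodeId must be non-null");
    Objects.requireNonNull(direction, "direction must be non-null");
    Objects.requireNonNull(granularity, "granularity must be non-null");
    if (depth < 1) {
      throw new IllegalArgumentException("depth must be at least 1");
    }
  }
}

// Simplified response: the queried node plus the ids of its neighbors.
record SketchGraph(String nodeId, List<String> upstream, List<String> downstream) {}

// The query boundary: callers depend on this interface rather than on
// a particular storage or forwarding backend.
interface SketchLineageService {
  SketchGraph getLineage(SketchQueryRequest request);
}
```

Keeping the boundary to a single query method means a persistence-backed or passthrough implementation could be swapped in later without touching callers.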

Out of Scope

  • Persistence SPI, persistence models, JDBC schema, and storage implementation.
  • OpenLineage ingest parsing and dataset/column edge extraction.
  • API endpoints.
  • RBAC privileges and authorization filtering.
  • Downstream forwarding and passthrough proxy behavior.

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

Copilot AI review requested due to automatic review settings June 11, 2026 13:17
@github-project-automation github-project-automation Bot moved this to PRs In Progress in Basic Kanban Board Jun 11, 2026

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Introduces a new lineage service boundary and supporting models/configuration, gated behind static and realm feature flags, with a placeholder persistence implementation.

Changes:

  • Adds core lineage API models (ingest/query requests, graph/node/edge types) and SPI (LineagePersistence, LineageService).
  • Implements DefaultLineageService with enablement checks (static config + realm feature + persistence flag) and adds a disabled persistence placeholder.
  • Adds unit tests for enablement gating and persistence delegation, and introduces a new realm feature flag ENABLE_LINEAGE.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| runtime/service/src/test/java/org/apache/polaris/service/lineage/DefaultLineageServiceTest.java | Adds unit coverage for lineage enablement checks and delegation to persistence. |
| runtime/service/src/main/java/org/apache/polaris/service/lineage/LineageConfiguration.java | Adds SmallRye config mapping for lineage feature and persistence toggles. |
| runtime/service/src/main/java/org/apache/polaris/service/lineage/DisabledLineagePersistence.java | Adds placeholder LineagePersistence that fails fast when invoked. |
| runtime/service/src/main/java/org/apache/polaris/service/lineage/DefaultLineageService.java | Adds request-scoped lineage service implementation and enablement gates. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageService.java | Defines a core service boundary for lineage operations. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageQueryRequest.java | Adds request model for lineage graph queries. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineagePersistence.java | Adds persistence SPI contract for lineage backends. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageNodeType.java | Adds enum for node kinds in lineage graphs. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageNode.java | Adds node model for lineage graph responses. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageIngestRequest.java | Adds normalized ingest payload model. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageGraph.java | Adds normalized lineage query response model. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageGranularity.java | Adds enum for dataset vs column granularity queries. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageFieldReference.java | Adds model for dataset field references used in column edges. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageFieldMapping.java | Adds mapping model for column-granularity responses. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageEdge.java | Adds dataset-to-dataset edge model. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageDirection.java | Adds enum for query direction. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageDataset.java | Adds dataset identity model for lineage. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageData.java | Adds response metadata wrapper for datasets. |
| polaris-core/src/main/java/org/apache/polaris/core/lineage/LineageColumnEdge.java | Adds column-level edge model. |
| polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java | Adds realm feature flag ENABLE_LINEAGE. |

Comment on lines +27 to +31
public LineageGraph {
  Objects.requireNonNull(node, "node must be non-null");
  upstream = List.copyOf(upstream);
  downstream = List.copyOf(downstream);
}

Contributor Author

fixed

Comment on lines +32 to +36
public LineageNode {
  Objects.requireNonNull(id, "id must be non-null");
  Objects.requireNonNull(type, "type must be non-null");
  fieldMappings = List.copyOf(fieldMappings);
}

Contributor Author

fixed
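
The review comments attached to the two record constructors above are not preserved on this page, so what exactly was fixed is unknown. One property of the quoted code worth noting is that `List.copyOf` throws a `NullPointerException` without a descriptive message when given a null list. The sketch below shows one possible way to handle that, assuming a null list should be treated as empty; `GraphSketch` and its `String` components are stand-ins for illustration, not the PR's actual types or its actual fix.

```java
import java.util.List;
import java.util.Objects;

// Stand-in record: String components replace the PR's node and edge types.
record GraphSketch(String node, List<String> upstream, List<String> downstream) {
  GraphSketch {
    Objects.requireNonNull(node, "node must be non-null");
    // Assumption: a missing list means "no neighbors", so null becomes an empty list.
    // List.copyOf also yields an unmodifiable snapshot, detached from the caller's list.
    upstream = upstream == null ? List.of() : List.copyOf(upstream);
    downstream = downstream == null ? List.of() : List.copyOf(downstream);
  }
}
```

An equally reasonable alternative is to reject null lists up front with `Objects.requireNonNull(upstream, "upstream must be non-null")`, which keeps the contract strict and still gives callers a clear message.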

Comment on lines +68 to +71
if (!callContext.getRealmConfig().getConfig(FeatureConfiguration.ENABLE_LINEAGE)) {
  throw new UnsupportedOperationException(
      "Feature not enabled: " + FeatureConfiguration.ENABLE_LINEAGE.key());
}

Contributor Author

fixed
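
The guard quoted above, together with the two behaviors the PR description says are unit tested (disabled guard, enabled-but-not-implemented placeholder), can be sketched in isolation as follows. This is a simplified stand-in, not the PR's DefaultLineageService: `GuardedLineageServiceSketch`, the `BooleanSupplier` used in place of the realm config lookup, the `FLAG_KEY` value, and the placeholder message are all assumptions.

```java
import java.util.function.BooleanSupplier;

final class GuardedLineageServiceSketch {
  // Placeholder: the real value of FeatureConfiguration.ENABLE_LINEAGE.key()
  // is not shown on this page.
  static final String FLAG_KEY = "ENABLE_LINEAGE";

  // Stands in for callContext.getRealmConfig().getConfig(...).
  private final BooleanSupplier lineageEnabled;

  GuardedLineageServiceSketch(BooleanSupplier lineageEnabled) {
    this.lineageEnabled = lineageEnabled;
  }

  String getLineage(String nodeId) {
    // Disabled guard: fail before doing any work when the flag is off.
    if (!lineageEnabled.getAsBoolean()) {
      throw new UnsupportedOperationException("Feature not enabled: " + FLAG_KEY);
    }
    // Enabled but no backend yet: an explicit placeholder failure.
    throw new UnsupportedOperationException("Lineage query is not implemented yet");
  }
}
```

Evaluating the flag on each call, rather than once at construction, lets a per-realm setting take effect without rebuilding the service.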

adnanhemani commented Jun 12, 2026

Contributor

Thanks for this @iting0321! I think we are trying to, however, do too much in this PR itself. Can we remove the Persistence-related models, as we may need a bit more time to close consensus on those bits? I understand that there will be no callers of these models as a result, but we will still need this in case of both persistence and passthrough models regardless. Given that nothing in here would be considered a "public interface" IMO, we should be ok to change it later down the line, if needed.

@iting0321

Contributor Author

> Thanks for this @iting0321! I think we are trying to, however, do too much in this PR itself. Can we remove the Persistence-related models, as we may need a bit more time to close consensus on those bits? I understand that there will be no callers of these models as a result, but we will still need this in case of both persistence and passthrough models regardless. Given that nothing in here would be considered a "public interface" IMO, we should be ok to change it later down the line, if needed.

I just removed the Persistence-related models (including ingest model).
This PR now keeps only the neutral lineage query/response models, the LineageService query boundary, runtime config/service wiring, the disabled-by-default ENABLE_LINEAGE guard, and the related tests/docs.
I also updated the PR description to explain what is out of scope. Thanks!


3 participants