
Discrepancy in building DatasetVersionIds and which version is used #1977

@collado-mike


In OpenLineageService, we construct a DatasetVersionId from the uuid property of the record, that is, its primary key. In RunDao, however, we construct the DatasetVersionId of a Run's inputs and outputs from the version property. This is confusing: code that consumes the return value of OpenLineageService expects a DatasetVersionId that can be matched against the values returned by RunDao, but the two never match.

OpenLineageService code: https://github.com/MarquezProject/marquez/blob/main/api/src/main/java/marquez/service/OpenLineageService.java#L208

RunDao code: https://github.com/MarquezProject/marquez/blob/main/api/src/main/java/marquez/db/RunDao.java#L85-L87
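
To make the mismatch concrete, here is a minimal sketch. The types below are hypothetical, simplified stand-ins and not the actual Marquez classes; in particular, it assumes that both the primary key and the version are UUIDs and that a DatasetVersionId compares by value.

```java
import java.util.UUID;

// Hypothetical, simplified stand-ins for illustration only; not the real Marquez classes.
final class DatasetVersionIdMismatch {

  /** Simplified identifier: namespace, dataset name, and a UUID (assumed shape). */
  record DatasetVersionId(String namespace, String name, UUID version) {}

  /** Simplified dataset version record with a primary key (uuid) and a separate version. */
  record DatasetVersionRow(UUID uuid, UUID version, String namespace, String name) {}

  /** Mirrors the behaviour described for OpenLineageService: the ID is built from the primary key. */
  static DatasetVersionId fromPrimaryKey(DatasetVersionRow row) {
    return new DatasetVersionId(row.namespace(), row.name(), row.uuid());
  }

  /** Mirrors the behaviour described for RunDao: the ID is built from the version property. */
  static DatasetVersionId fromVersion(DatasetVersionRow row) {
    return new DatasetVersionId(row.namespace(), row.name(), row.version());
  }

  public static void main(String[] args) {
    // Fixed, distinct UUIDs so the result is deterministic.
    DatasetVersionRow row =
        new DatasetVersionRow(
            UUID.fromString("00000000-0000-0000-0000-000000000001"),
            UUID.fromString("00000000-0000-0000-0000-000000000002"),
            "my-namespace",
            "my-dataset");

    DatasetVersionId fromService = fromPrimaryKey(row);
    DatasetVersionId fromDao = fromVersion(row);

    // Same underlying row, but the two IDs are not equal.
    System.out.println(fromService.equals(fromDao)); // prints "false"
  }
}
```

Both IDs refer to the same underlying row, yet they compare as unequal because each is built from a different column. Choosing one of the two properties consistently in both places would let callers match them.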

Labels: bug
