Addition of virtual _hoodie_commit_completion_time column which is used for Incremental Queries #14036

@vamshikrishnakyatham

Description

Bug Description

What happened:
Incremental queries on Hudi tables with version >= 6 filter on commit completion time, but that value does not appear in the query output. The output currently exposes only the requested time, and filtering on it gives inconsistent and incorrect results. Without the completion time in the output, users have no reliable way to choose the time bounds for the filter.
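To illustrate the mismatch, here is a minimal toy model in Python. It is not Hudi code: the Commit class, the sample timeline, and the (begin, end] range convention are all assumptions made for the sketch. It only shows that when an earlier-requested commit finishes after a later one, the same time window selects different commits depending on which timestamp is compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    requested: str  # instant at which the commit was requested
    completed: str  # instant at which the commit finished


# Hypothetical timeline: the first commit is requested first but completes last.
TIMELINE = [
    Commit(requested="20240101100000000", completed="20240101100500000"),
    Commit(requested="20240101100100000", completed="20240101100200000"),
]


def by_requested(timeline, begin, end):
    # Assumed convention for this sketch: begin exclusive, end inclusive.
    return [c for c in timeline if begin < c.requested <= end]


def by_completed(timeline, begin, end):
    # Same window, but compared against the completion time instead.
    return [c for c in timeline if begin < c.completed <= end]


# Fixed-width digit strings compare correctly as plain strings.
WINDOW = ("20240101095900000", "20240101100300000")

requested_hits = by_requested(TIMELINE, *WINDOW)
completed_hits = by_completed(TIMELINE, *WINDOW)
print(len(requested_hits), len(completed_hits))
```

In this toy timeline the requested-time filter selects both commits, while the completion-time filter selects only the second one, so bounds chosen from requested times do not line up with what a completion-time filter returns.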

What you expected:
A completion time virtual column (_hoodie_commit_completion_time) should be available for Hudi incremental queries on tables with version >= 6. The column is added dynamically during query execution and provides completion time semantics for both filtering and display.
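A rough sketch of what "dynamically added during query execution" could mean, in plain Python rather than any query engine. The helper add_completion_time and the lookup table are hypothetical; the sketch assumes each row carries the existing _hoodie_commit_time meta field and that a mapping from that instant to its completion time can be read from the timeline. It is not how Hudi implements the column.

```python
def add_completion_time(rows, completion_by_commit_time):
    """Return copies of rows with a _hoodie_commit_completion_time field attached.

    rows: iterable of dicts, each with a "_hoodie_commit_time" key.
    completion_by_commit_time: hypothetical lookup from commit (requested)
    instant to completion instant; unknown instants map to None.
    """
    enriched_rows = []
    for row in rows:
        enriched = dict(row)  # copy, so the stored row is not modified
        enriched["_hoodie_commit_completion_time"] = completion_by_commit_time.get(
            row["_hoodie_commit_time"]
        )
        enriched_rows.append(enriched)
    return enriched_rows


# Hypothetical sample data for the sketch.
rows = [
    {"id": 1, "_hoodie_commit_time": "20240101100000000"},
    {"id": 2, "_hoodie_commit_time": "20240101100100000"},
]
lookup = {
    "20240101100000000": "20240101100500000",
    "20240101100100000": "20240101100200000",
}

result = add_completion_time(rows, lookup)
for r in result:
    print(r["id"], r["_hoodie_commit_completion_time"])
```

Because the value is computed at read time from the lookup, nothing is written to the stored rows, which is the sense in which the column is virtual.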

Steps to reproduce:

  1. Create a Hudi table that supports incremental queries.
  2. Run an incremental query using the requested times (the only times currently shown in the output) as the filter bounds. The results are inconsistent and incorrect.
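The steps above could be driven from Spark roughly as follows. This is a sketch under assumptions: the option keys are the Hudi Spark datasource read options as I understand them and should be checked against the Hudi version in use, and the helper name, the instants, and the table path are placeholders that do not come from this report.

```python
def incremental_read_options(begin_instant, end_instant=None):
    """Build read options for a Hudi incremental query (hypothetical helper)."""
    options = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        options["hoodie.datasource.read.end.instanttime"] = end_instant
    return options


# Placeholder instants; in the failing scenario these are copied from the
# requested times shown in the query output.
opts = incremental_read_options("20240101095900000", "20240101100300000")
print(opts)

# With a SparkSession named `spark` and the Hudi bundle on the classpath,
# the options would be passed along these lines (not executed here):
#   df = spark.read.format("hudi").options(**opts).load("/path/to/hudi_table")
```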

Environment

Hudi version: 1.1
Query engine: (Spark/Flink/Trino etc)
Relevant configs:

Logs and Stack Trace

No response

Metadata

    Labels

    type:bug (for issues and PRs that fix bugs)
