@pratapaditya04 pratapaditya04 commented Sep 25, 2025

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

Add Dataset State and Shared Resource Broker Support to GenerateWorkUnitsImpl

Summary

This PR adds dataset state functionality to GenerateWorkUnitsImpl, matching the MR job launcher pattern, so that work units can access previous watermarks during work discovery for incremental data processing.

Why This Change Is Needed

The Temporal GenerateWorkUnitsImpl was missing critical functionality that exists in the MR job launcher:

  • Missing Watermark Access:
  • No Shared Resource Broker: The job state lacked proper resource management and broker initialization.
  • Work Discovery: Work units were generated without considering previous watermarks, which broke incremental ingestion.

Changes Made

  • Added the addDatasetStateFunctionalAndSharedResourceBrokerToJobState method, which mirrors the MR job launcher setup.
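
For readers unfamiliar with the pattern, the following is a minimal, self-contained sketch of why work discovery needs access to the previous dataset state. It is not the code in this PR and does not use Gobblin's API: `WatermarkSketch`, `SketchJobState`, `SketchWorkUnit`, `withStateStore`, and `discover` are all hypothetical names, and the state store is assumed to be a plain in-memory map from dataset URN to the last committed watermark.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Illustrative only: none of these types are Gobblin classes. */
class WatermarkSketch {

  /** Hypothetical stand-in for a job state that can look up prior dataset state. */
  static final class SketchJobState {
    // Without any wiring, the lookup knows nothing about earlier runs.
    private Function<String, Optional<Long>> previousWatermarkLookup = urn -> Optional.empty();

    void setPreviousWatermarkLookup(Function<String, Optional<Long>> lookup) {
      this.previousWatermarkLookup = lookup;
    }

    Optional<Long> getPreviousWatermark(String datasetUrn) {
      return previousWatermarkLookup.apply(datasetUrn);
    }
  }

  /** Hypothetical work unit covering the half-open range [lowWatermark, highWatermark). */
  static final class SketchWorkUnit {
    final String datasetUrn;
    final long lowWatermark;
    final long highWatermark;

    SketchWorkUnit(String datasetUrn, long lowWatermark, long highWatermark) {
      this.datasetUrn = datasetUrn;
      this.lowWatermark = lowWatermark;
      this.highWatermark = highWatermark;
    }
  }

  /** Assumed setup step: attach state-store access to the job state before discovery. */
  static SketchJobState withStateStore(Map<String, Long> committedWatermarks) {
    SketchJobState jobState = new SketchJobState();
    jobState.setPreviousWatermarkLookup(
        urn -> Optional.ofNullable(committedWatermarks.get(urn)));
    return jobState;
  }

  /** Work discovery: resume from the previous watermark if one exists, else start at 0. */
  static SketchWorkUnit discover(SketchJobState jobState, String datasetUrn, long latestAvailable) {
    long low = jobState.getPreviousWatermark(datasetUrn).orElse(0L);
    return new SketchWorkUnit(datasetUrn, low, latestAvailable);
  }
}
```

In this sketch, a job state created without `withStateStore` always yields a low watermark of 0, so every run re-reads the full dataset. Once the lookup is attached, `discover` resumes from the last committed value, which is the incremental behavior the description above refers to.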

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Added unit tests

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

@pratapaditya04 pratapaditya04 changed the title from "enabled access to datastate store during work discovery" to "[GOBBLIN-2229]enable access to datastate store during work discovery" Sep 25, 2025
@abhishekmjain abhishekmjain merged commit 20731d5 into apache:master Sep 26, 2025
6 checks passed