@dybyte
Contributor

@dybyte dybyte commented Dec 28, 2025

Fixes: #10181

Purpose of this pull request

Background

Currently, SeaTunnel relies directly on Hazelcast IMap for internal state management. This tight coupling makes it difficult to:

  • Replace the underlying storage engine (e.g., switching from Hazelcast IMap to RocksDB)
  • Clearly define the responsibilities of state management components

Proposal

Introduce a StateStore abstraction to encapsulate all operations currently performed on Hazelcast IMap, along with a StateStoreFactory to create instances of StateStore.

Components

  • StateStore: Abstracts all methods of Hazelcast IMap used internally in SeaTunnel.
  • StateStoreFactory: Responsible for creating StateStore instances.
  • DistributedStoreManager: Creates and provides the appropriate StateStore via the NodeEngine.
  • As a result, each service no longer interacts with IMap directly. Instead, it passes the NodeEngine to the DistributedStoreManager to obtain a StateStore.

With this design, Hazelcast IMap can later be replaced by RocksDB or another storage engine by implementing StateStore and StateStoreFactory for that engine, because services depend only on the StateStore interface rather than on IMap.
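
A minimal sketch of how these pieces could fit together. The method set on `StateStore`, the in-memory backend, and the manager constructor that takes a factory are illustrative assumptions, not the code in this PR: per the description above, the actual manager obtains the store through the NodeEngine and the backend is Hazelcast IMap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StateStoreSketch {

    // Hypothetical, trimmed-down abstraction; the real interface mirrors the
    // IMap methods SeaTunnel uses and may look different.
    interface StateStore<K, V> {
        V get(K key);
        V put(K key, V value);
        V remove(K key);
        boolean containsKey(K key);
        int size();
    }

    // Creates (or looks up) a named store, one per logical map.
    interface StateStoreFactory {
        <K, V> StateStore<K, V> create(String name);
    }

    // Stand-in backend for this sketch only; a Hazelcast- or RocksDB-backed
    // class would implement the same interface.
    static final class InMemoryStateStore<K, V> implements StateStore<K, V> {
        private final Map<K, V> data = new ConcurrentHashMap<>();

        @Override public V get(K key) { return data.get(key); }
        @Override public V put(K key, V value) { return data.put(key, value); }
        @Override public V remove(K key) { return data.remove(key); }
        @Override public boolean containsKey(K key) { return data.containsKey(key); }
        @Override public int size() { return data.size(); }
    }

    static final class InMemoryStateStoreFactory implements StateStoreFactory {
        private final Map<String, StateStore<?, ?>> stores = new ConcurrentHashMap<>();

        @Override
        @SuppressWarnings("unchecked")
        public <K, V> StateStore<K, V> create(String name) {
            // The same name always resolves to the same store instance.
            return (StateStore<K, V>) stores.computeIfAbsent(
                    name, n -> new InMemoryStateStore<Object, Object>());
        }
    }

    // Assumption: the manager receives the factory directly. In the PR it is
    // reached via the NodeEngine instead.
    static final class DistributedStoreManager {
        private final StateStoreFactory factory;

        DistributedStoreManager(StateStoreFactory factory) {
            this.factory = factory;
        }

        <K, V> StateStore<K, V> getStateStore(String name) {
            return factory.create(name);
        }
    }

    public static void main(String[] args) {
        // Swapping the backend means swapping only this factory.
        DistributedStoreManager manager =
                new DistributedStoreManager(new InMemoryStateStoreFactory());

        StateStore<Long, String> jobStates = manager.getStateStore("runningJobState");
        jobStates.put(1L, "RUNNING");
        System.out.println(jobStates.get(1L)); // prints RUNNING
    }
}
```

In this sketch the only line that names a concrete backend is the factory construction in `main`, which is the property the proposal relies on.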

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Covered by existing tests.

Check list

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a StateStore abstraction to decouple SeaTunnel from direct Hazelcast IMap dependencies, making it easier to swap out storage backends (e.g., to RocksDB) in the future.

Key changes:

  • Introduces StateStore interface and StateStoreFactory to abstract storage operations
  • Implements HazelcastStateStore and HazelcastStateStoreFactory as the current Hazelcast-backed implementation
  • Adds DistributedStoreManager to centralize StateStore instance creation
  • Migrates all IMap usages throughout the codebase to use StateStore abstraction
  • Updates variable names, method names, and comments from "IMap" terminology to "StateStore" terminology
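
The migration pattern in these changes can be sketched as a thin delegating adapter plus a service that holds a `StateStore` field where it previously held an IMap. This is only an illustration under stated assumptions: `DelegatingStateStore`, `JobStateService`, and the four methods on `StateStore` are hypothetical names, and a `ConcurrentHashMap` stands in for Hazelcast. Because Hazelcast's `IMap` extends `java.util.concurrent.ConcurrentMap`, the same delegation shape would apply to an IMap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MigrationSketch {

    // Hypothetical subset of the abstraction, just enough for this example.
    interface StateStore<K, V> {
        V get(K key);
        V put(K key, V value);
        V putIfAbsent(K key, V value);
        V remove(K key);
    }

    // Adapter that forwards every call to a ConcurrentMap. A Hazelcast-backed
    // store could wrap an IMap the same way.
    static final class DelegatingStateStore<K, V> implements StateStore<K, V> {
        private final ConcurrentMap<K, V> delegate;

        DelegatingStateStore(ConcurrentMap<K, V> delegate) {
            this.delegate = delegate;
        }

        @Override public V get(K key) { return delegate.get(key); }
        @Override public V put(K key, V value) { return delegate.put(key, value); }
        @Override public V putIfAbsent(K key, V value) { return delegate.putIfAbsent(key, value); }
        @Override public V remove(K key) { return delegate.remove(key); }
    }

    // Hypothetical service after migration: it sees only StateStore and has
    // no compile-time reference to the backing map type.
    static final class JobStateService {
        private final StateStore<Long, String> runningJobState;

        JobStateService(StateStore<Long, String> runningJobState) {
            this.runningJobState = runningJobState;
        }

        // Returns true only for the first registration of a job id.
        boolean register(long jobId) {
            return runningJobState.putIfAbsent(jobId, "CREATED") == null;
        }

        void transition(long jobId, String state) {
            runningJobState.put(jobId, state);
        }

        String stateOf(long jobId) {
            return runningJobState.get(jobId);
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<Long, String> backing = new ConcurrentHashMap<>();
        JobStateService service = new JobStateService(new DelegatingStateStore<>(backing));

        service.register(42L);
        service.transition(42L, "RUNNING");
        System.out.println(service.stateOf(42L)); // prints RUNNING
    }
}
```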

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| StateStore.java | New interface defining storage abstraction with map-like operations |
| StateStoreFactory.java | Factory interface for creating StateStore instances |
| HazelcastStateStore.java | Hazelcast IMap implementation of StateStore interface |
| HazelcastStateStoreFactory.java | Factory for creating HazelcastStateStore instances |
| DistributedStoreManager.java | Manager class that provides StateStore instances via factory |
| JobMaster.java | Updated to use StateStore instead of IMap for job state management |
| CoordinatorService.java | Migrated IMap fields to StateStore for coordinator state |
| CheckpointCoordinator.java | Updated checkpoint state storage to use StateStore |
| SeaTunnelServer.java | Changed metrics storage from IMap to StateStore |
| JobHistoryService.java | Updated job history storage to use StateStore abstraction |
| PhysicalPlan.java, SubPlan.java, PhysicalVertex.java | Updated physical execution plan state storage to StateStore |
| Multiple test files | Updated tests to use StateStore and DistributedStoreManager |
| AbstractSeaTunnelServerTest.java | Added DistributedStoreManager field for test infrastructure |
| BaseService.java, JobInfoService.java | Updated REST service layer to use StateStore |
| SharedConnectorJarStorageStrategy.java | Migrated connector JAR ref counting to StateStore |

dybyte and others added 10 commits December 30, 2025 13:54
…che/seatunnel/engine/server/CoordinatorServiceTest.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/CoordinatorServiceTest.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/checkpoint/CheckpointManagerTest.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/dag/physical/SubPlan.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/master/JobHistoryService.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/master/JobMaster.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/execution/TaskExecutionContext.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/dag/physical/SubPlan.java

Co-authored-by: Copilot <[email protected]>
…che/seatunnel/engine/server/CoordinatorServiceTest.java

Co-authored-by: Copilot <[email protected]>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 31 out of 31 changed files in this pull request and generated 5 comments.


@davidzollo davidzollo requested a review from Hisoka-X December 30, 2025 10:35
dybyte added 2 commits January 5, 2026 23:05
…e/imap-interface

# Conflicts:
#	seatunnel-engine/seatunnel-engine-server/src/main/java/org/apache/seatunnel/engine/server/checkpoint/IMapCheckpointIDCounter.java
@dybyte
Contributor Author

dybyte commented Jan 5, 2026

Hi @zhangshenghang, PTAL when you have time. Thanks!

Development

Successfully merging this pull request may close these issues.

[Improve][Zeta] Introduce MapStorage interface to decouple from Hazelcast IMap

1 participant