generated from amazon-archives/__template_Apache-2.0
-
Notifications
You must be signed in to change notification settings - Fork 40
Closed
Labels
Description
Is your feature request related to a problem? Please describe.
Currently, the reindex-from-snapshot process will attempt to migrate all docs in a shard with one worker and will only mark the shard as completed once all docs are migrated in a single attempt. This increases migration time and risk for larger shards as the first docs in each shard may be retried several times before succeeding. The reliance on a long-running single worker for large shards increases risk of failure increasing the time to complete the migration.
Describe the solution you'd like
Implement the ability to regularly checkpoint and resume migration of shards, which limits the amount of duplicate times a doc is migrated particularly for large shards. Implement a ceiling on the duration a lease for a work-item can reach.
Describe alternatives you've considered
- We can upfront split up the shard into sub-shard work items that can be migrated in parallel, but this introduces complexity and increases unevenness of work distribution in the target cluster as when an index shard count remains the same during a migration, each sub-shard worker for a given source shard will hit a single node/shard in the target cluster.
Additional context
Jira Epic(s)
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
Done