Incremental snapshot not working correctly after running forcemerge. #102395

Open
@merajblueshift

Elasticsearch Version

Version: 8.7.1, Build: rpm/f229ed3f893a515d590d0f39b05f68913e2d9b53/2023-04-27T04:33:42.127815583Z, JVM: 20.0.1

Installed Plugins

discovery-ec2

Java Version

bundled

OS Version

Linux 6.1.59-84.139.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Oct 24 20:57:25 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

We had an index with a very high docs.deleted count. To clean it up, we restored the index on a cluster where nothing was reading from or writing to it, and then ran _forcemerge with only_expunge_deletes set to true on it. Before the force merge started, SLM took one snapshot of the index to S3.
The force merge reduced the number of segments from 5k to 1.5k. However, when we took a fresh snapshot of the index and restored it, the restored index did not match the state after the force merge; it still had the state from before the force merge. The snapshot details also showed that the incremental snapshot taken after the force merge finished in 800 ms, even though all of the underlying segment files had changed.
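One way to check how much data the second snapshot actually uploaded is to look at the `stats` section of the snapshot status response (`GET _snapshot/<repository>/<snapshot>/_status`). The sketch below only summarises a response that has already been parsed into a dict; the helper name `incremental_summary`, the repository and snapshot names, and every number in `sample` except the 800 ms duration are invented for illustration and are not taken from the affected cluster.

```python
def incremental_summary(status_response):
    """Summarise each snapshot in a parsed snapshot status response.

    'incremental' counts the files copied by this snapshot; 'total' counts
    every file the snapshot references.
    """
    rows = []
    for snap in status_response["snapshots"]:
        stats = snap["stats"]
        rows.append({
            "snapshot": snap["snapshot"],
            "new_files": stats["incremental"]["file_count"],
            "total_files": stats["total"]["file_count"],
            "new_bytes": stats["incremental"]["size_in_bytes"],
            "time_in_millis": stats["time_in_millis"],
        })
    return rows


# Hypothetical response fragment: the file counts and sizes are made up.
sample = {
    "snapshots": [
        {
            "snapshot": "after-merge",      # hypothetical snapshot name
            "repository": "my-s3-repo",     # hypothetical repository name
            "stats": {
                "incremental": {"file_count": 0, "size_in_bytes": 0},
                "total": {"file_count": 12, "size_in_bytes": 4096},
                "time_in_millis": 800,
            },
        }
    ]
}

for row in incremental_summary(sample):
    print(row)
```

If `new_files` is 0 (or far below `total_files`) for a snapshot taken after the merge rewrote the segments, that would be consistent with the snapshot reusing files from the earlier snapshot instead of uploading the merged ones.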

Steps to Reproduce

  1. Create an index, then add and delete documents until its docs.deleted count is very high.
  2. Stop all reads/writes to the index.
  3. Set index.merge.policy.expunge_deletes_allowed to 1.
  4. Take a snapshot of the index to S3.
  5. Run _forcemerge with only_expunge_deletes set to true.
  6. Confirm that the docs.deleted count and the number of segments on the index drop after the _forcemerge.
  7. Take a new snapshot of the index.
  8. Restore the index from the snapshot taken in step 7.
  9. Observe that the restored index still has the state from before the _forcemerge operation.
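Steps 3 to 8 above can be sketched as a sequence of REST calls. This is a minimal sketch, not the exact commands that were run: the index, repository, and snapshot names are placeholders, the helpers `build_repro_requests` and `send` are hypothetical, and the first snapshot is taken with the create snapshot API here rather than by SLM. Step 8 restores under a renamed index so that it does not collide with the still-open original.

```python
import json
import urllib.request


def build_repro_requests(index, repo, before="before-merge", after="after-merge"):
    """Return the REST calls for steps 3-8 as (method, path, body) tuples."""
    return [
        # Step 3: lower the expunge-deletes threshold on the index.
        ("PUT", f"/{index}/_settings",
         {"index.merge.policy.expunge_deletes_allowed": 1}),
        # Step 4: snapshot taken before the force merge.
        ("PUT", f"/_snapshot/{repo}/{before}?wait_for_completion=true",
         {"indices": index}),
        # Step 5: force merge that only expunges deleted documents.
        ("POST", f"/{index}/_forcemerge?only_expunge_deletes=true", None),
        # Step 6: read docs.deleted and the segment count after the merge.
        ("GET", f"/{index}/_stats/docs,segments", None),
        # Step 7: snapshot taken after the force merge.
        ("PUT", f"/_snapshot/{repo}/{after}?wait_for_completion=true",
         {"indices": index}),
        # Step 8: restore the second snapshot under a new index name.
        ("POST", f"/_snapshot/{repo}/{after}/_restore?wait_for_completion=true",
         {"indices": index,
          "rename_pattern": "(.+)",
          "rename_replacement": "restored-$1"}),
    ]


def send(base_url, method, path, body=None):
    """Send one call to a cluster at base_url and return the parsed JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        base_url + path, data=data, method=method,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read() or b"null")
```

Against a test cluster, the calls could be replayed with something like `for method, path, body in build_repro_requests("my-index", "my-s3-repo"): send("http://localhost:9200", method, path, body)`, then comparing `GET /restored-my-index/_stats/docs,segments` with the output from step 6. Authentication and TLS are left out of the sketch.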

Logs (if relevant)

No response

Metadata

Labels

:Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.)
>bug
Team:Distributed (Obsolete) (Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.)
