Description
Elasticsearch Version
Version: 8.7.1, Build: rpm/f229ed3f893a515d590d0f39b05f68913e2d9b53/2023-04-27T04:33:42.127815583Z, JVM: 20.0.1
Installed Plugins
discovery-ec2
Java Version
bundled
OS Version
Linux 6.1.59-84.139.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Oct 24 20:57:25 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Problem Description
We had an index with a large number of deleted documents (docs.deleted). To clean it up, we restored the index on a cluster where no reads or writes were happening on it. We then ran _forcemerge with only_expunge_deletes set to true on the index to optimize it. Before the force merge started, SLM took one snapshot of the index to S3.
The force merge optimized the index and reduced the number of segments from 5k to 1.5k. However, when we took a fresh snapshot of the index and restored it, the restored index did not match the post-merge state; it still had the state from before the force merge. On checking the snapshot details, we also observed that the incremental snapshot taken after the force merge finished in 800 ms, even though all underlying segment files had changed.
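One way to make the mismatch concrete is to compare the index stats of the force-merged index and the restored index. The sketch below assumes both responses come from GET /<index>/_stats and reads the primaries.docs and primaries.segments sections. The index names and document counts are hypothetical placeholders; only the 5k and 1.5k segment counts come from this report.

```python
def index_state(stats, index):
    """Extract doc and segment counts from a GET /<index>/_stats response body."""
    primaries = stats["indices"][index]["primaries"]
    return {
        "docs_count": primaries["docs"]["count"],
        "docs_deleted": primaries["docs"]["deleted"],
        "segments": primaries["segments"]["count"],
    }


def state_diff(expected, actual):
    """Return {field: (expected, actual)} for every field that differs."""
    return {k: (expected[k], actual[k]) for k in expected if expected[k] != actual[k]}


# Hypothetical, trimmed-down _stats responses (illustration only).
merged_stats = {"indices": {"my-index": {"primaries": {
    "docs": {"count": 1000, "deleted": 10},
    "segments": {"count": 1500},
}}}}
restored_stats = {"indices": {"restored_my-index": {"primaries": {
    "docs": {"count": 1000, "deleted": 400},
    "segments": {"count": 5000},
}}}}

diff = state_diff(
    index_state(merged_stats, "my-index"),
    index_state(restored_stats, "restored_my-index"),
)
print(diff)
```

A non-empty diff on docs_deleted or segments would indicate that the restored index reflects the pre-merge state, which is the behavior described here.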
Steps to Reproduce
1. Create an index, then add and delete documents in it. Make sure the docs.deleted count on the index reaches a very high number.
2. Stop all reads/writes to the index.
3. Change the index.merge.policy.expunge_deletes_allowed setting to 1.
4. Take a snapshot of the index to S3.
5. Run a _forcemerge operation with only_expunge_deletes set to true.
6. The docs.deleted count and the number of segments on the index should decrease after _forcemerge.
7. Take a new snapshot of the index.
8. Restore the index from the snapshot taken in step 7.
9. The restored index still has the state of the index from before the _forcemerge operation.
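The steps above can be sketched as a sequence of REST calls. This is a minimal sketch, not the exact commands we ran: the index, repository, and snapshot names are hypothetical, the snapshots are created directly through the snapshot API rather than by SLM, and the restore uses rename_pattern/rename_replacement so the restored copy can sit next to the original for comparison. The script only prints the requests; it does not contact a cluster.

```python
import json


def reproduction_requests(index, repo, before_snap, after_snap):
    """Build (method, path, body) tuples for the reproduction steps."""
    return [
        # Let only_expunge_deletes consider segments with few deletes.
        ("PUT", f"/{index}/_settings",
         {"index.merge.policy.expunge_deletes_allowed": 1}),
        # Snapshot taken before the force merge.
        ("PUT", f"/_snapshot/{repo}/{before_snap}?wait_for_completion=true",
         {"indices": index}),
        # Force merge that only expunges deleted documents.
        ("POST", f"/{index}/_forcemerge?only_expunge_deletes=true", None),
        # Check docs.deleted and segment count after the merge.
        ("GET", f"/{index}/_stats/docs,segments", None),
        # Snapshot taken after the force merge.
        ("PUT", f"/_snapshot/{repo}/{after_snap}?wait_for_completion=true",
         {"indices": index}),
        # Restore the post-merge snapshot under a new name (assumption).
        ("POST", f"/_snapshot/{repo}/{after_snap}/_restore?wait_for_completion=true",
         {"indices": index,
          "rename_pattern": "(.+)",
          "rename_replacement": "restored_$1"}),
        # Compare against the stats of the force-merged index.
        ("GET", f"/restored_{index}/_stats/docs,segments", None),
    ]


# Hypothetical names, for illustration only.
requests = reproduction_requests("my-index", "my-s3-repo", "before-merge", "after-merge")
for method, path, body in requests:
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

If the bug reproduces, the last GET reports the pre-merge docs.deleted and segment counts instead of the values returned by the earlier GET.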
Logs (if relevant)
No response