
Force merge behaviour unreliable #102594

Open
@EgbertW

Description

Elasticsearch Version

8.10.3

Installed Plugins

No response

Java Version

bundled

OS Version

Linux 5.15.107+ #1 SMP x86_64 x86_64 x86_64 GNU/Linux

Problem Description

I monitor an index that relies on force merging to 1 segment for performance reasons. The process is as follows (a sketch of steps 4-7 appears after the list):

  1. Create index, replicas set to 0
  2. Bulk insert millions of documents into it
  3. Refresh index
  4. Force merge with max_num_segments set to 1, wait_for_completion=false
  5. Use task API to read task until task is done
  6. Use stats API to get the number of primary segments
  7. If the number of primary segments is more than the number of shards, go back to step 4 and repeat
  8. Set number_of_replicas to 5, wait for replication to complete
  9. Start using index for querying
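
For reference, here is a minimal sketch of steps 4-7, assuming plain REST calls from Python via requests (the host, index name, and poll interval are placeholders):

import time
import requests

ES = "http://localhost:9200"  # placeholder host
INDEX = "my-index"            # placeholder index name

def force_merge_to_one_segment(num_shards):
    while True:
        # Step 4: submit the force merge as a background task.
        resp = requests.post(
            f"{ES}/{INDEX}/_forcemerge",
            params={"max_num_segments": 1, "wait_for_completion": "false"},
        )
        task_id = resp.json()["task"]

        # Step 5: poll the task API until the task reports completion.
        while not requests.get(f"{ES}/_tasks/{task_id}").json()["completed"]:
            time.sleep(10)

        # Step 6: read the primary segment count from the stats API.
        stats = requests.get(f"{ES}/{INDEX}/_stats/segments").json()
        count = stats["indices"][INDEX]["primaries"]["segments"]["count"]

        # Step 7: retry the merge if any shard still has extra segments.
        if count <= num_shards:
            return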

Multiple concurrent processes like this can run on the same cluster. We've noticed on multiple occasions that steps 3 and 4 are not enough: the task completes successfully but more than one segment still exists for one or more shards. So we added steps 6 and 7 to retry when that happens.

A new issue surfaced recently: even that check was not good enough. The stats API reported a number of primary segments equal to the number of shards for several minutes, and replication was initiated. During the allocation of the additional replicas, the number of segments jumped back up to 44. Essentially, the force merge on one shard had been reversed.

This mostly seems to happen when multiple indices are force merged at the same time. We noticed log messages like:

now throttling indexing: numMergesInFlight=10, maxNumMerges=9
stop throttling indexing: numMergesInFlight=8, maxNumMerges=9

This message appears to be a red flag - it looks like not only is indexing throttled, but the force merge is also partially cancelled.

This all feels like a bug somewhere in the force merge administration - the task is not monitored properly and is not guaranteed to complete, and even if it does appear to complete correctly, there seems to be a chance that the force merge is reversed.

Steps to Reproduce

  • Have an Elasticsearch cluster with the force_merge threadpool size set to X=8
  • Create 2 separate indices with 0 replicas and S=6 shards each (so that 2xS=12 exceeds X=8)
  • Bulk insert millions of documents into both of them, concurrently (enough to be sure the force merge takes a significant amount of time)
  • As soon as all documents are indexed into an index, trigger a refresh on it
  • Directly after the refresh, initiate a force merge with max_num_segments set to 1 (do this separately for both indices)
  • Use the task API to monitor the resulting tasks for completion
  • Initiate replication on the indices as soon as the tasks are done
  • Monitor the number of primary segments throughout the entire process (a monitoring sketch follows this list)

Rinse & repeat a couple of times if necessary. Observe that the number of primary segments is not guaranteed to be 12 (6 on each index, 1 on each shard) after this.
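
One way to do that monitoring is to poll per-shard primary segment counts via the stats API; a minimal sketch (index names and the poll interval are again placeholders):

import time
import requests

ES = "http://localhost:9200"        # placeholder host
INDICES = ["index-a", "index-b"]    # placeholder index names

while True:
    for index in INDICES:
        # level=shards breaks the stats down per shard copy.
        stats = requests.get(
            f"{ES}/{index}/_stats/segments", params={"level": "shards"}
        ).json()
        for shard_id, copies in stats["indices"][index]["shards"].items():
            for copy in copies:
                if copy["routing"]["primary"]:
                    count = copy["segments"]["count"]
                    print(f"{index} shard {shard_id}: {count} segment(s)")
    time.sleep(30)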

The only lead we currently have is this documentation page: https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-forcemerge.html

which states:

Shards that are relocating during a forcemerge will not be merged.

Theoretically, the replication of one index could trigger a rebalancing of primary shards of the other, effectively nullifying the effect of a force merge that is still ongoing.
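
If that theory is right, a pre-replication check along these lines might at least narrow the window (the task action name and _cat columns are from the docs; we have not verified that this fully closes the race):

import requests

ES = "http://localhost:9200"  # placeholder host

def safe_to_enable_replicas():
    # Force merges run as indices:admin/forcemerge tasks; make sure none
    # are still in flight anywhere in the cluster.
    tasks = requests.get(
        f"{ES}/_tasks", params={"actions": "indices:admin/forcemerge*"}
    ).json()
    if any(node["tasks"] for node in tasks.get("nodes", {}).values()):
        return False

    # Any shard not in STARTED state (e.g. RELOCATING) is suspect.
    shards = requests.get(
        f"{ES}/_cat/shards",
        params={"h": "index,shard,prirep,state", "format": "json"},
    ).json()
    return all(shard["state"] == "STARTED" for shard in shards)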

It would be great to have some reliable grip on the process and a way to be sure that it has completed correctly.

Logs (if relevant)

now throttling indexing: numMergesInFlight=10, maxNumMerges=9
stop throttling indexing: numMergesInFlight=8, maxNumMerges=9

Metadata

Labels

:Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.)
>docs (General docs changes)
Team:Distributed (Obsolete) (Meta label for distributed team, obsolete; replaced by Distributed Indexing/Coordination.)
Team:Docs (Meta label for docs team)
