Observed in the wild during a cluster state application that took ~5s. Several GENERIC
threads were all performing recoveries and running here:
app/[email protected]/org.elasticsearch.index.shard.IndexShard.createEngine(IndexShard.java:2146)
app/[email protected]/org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:2119)
app/[email protected]/org.elasticsearch.index.shard.IndexShard.openEngineAndSkipTranslogRecovery(IndexShard.java:2095)
app/[email protected]/org.elasticsearch.indices.recovery.PeerRecoveryTargetService.lambda$doRecovery$2(PeerRecoveryTargetService.java:310)
app/[email protected]/org.elasticsearch.indices.recovery.PeerRecoveryTargetService$$Lambda/0x000000c802a52458.apply(Unknown Source)
app/[email protected]/org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:89)
app/[email protected]/org.elasticsearch.action.ActionListenerImplementations$DelegatingResponseActionListener.onResponse(ActionListenerImplementations.java:182)
app/[email protected]/org.elasticsearch.index.CompositeIndexEventListener.callListeners(CompositeIndexEventListener.java:290)
app/[email protected]/org.elasticsearch.index.CompositeIndexEventListener.lambda$callListeners$10(CompositeIndexEventListener.java:284)
app/[email protected]/org.elasticsearch.index.CompositeIndexEventListener$$Lambda/0x000000c802a75000.accept(Unknown Source)
app/[email protected]/org.elasticsearch.action.ActionRunnable$4.doRun(ActionRunnable.java:100)
app/[email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
app/[email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
[email protected]/java.lang.Thread.runWith(Thread.java:1596)
[email protected]/java.lang.Thread.run(Thread.java:1583)
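The trace suggests each recovery thread is inside engine creation, and the applier-side trace below suggests closing the shard needs the same shard-level lock, so the applier ends up waiting for a potentially slow engine open to finish. Below is a minimal, self-contained sketch of that blocking pattern; `ShardSketch`, `engineLock`, and the method bodies are illustrative stand-ins inferred from the stack traces, not `IndexShard` code.

```java
import java.util.concurrent.TimeUnit;

// Minimal sketch (not IndexShard code): if the engine is opened while holding a
// shard-level lock, the open time directly becomes wait time for anything else
// that needs the same lock, e.g. closing the shard.
class ShardSketch {
    private final Object engineLock = new Object(); // stand-in for a shard-level engine mutex
    private volatile boolean closed;

    // Roughly the GENERIC recovery thread's side: open the engine under the lock.
    // The sleep stands in for a slow engine open.
    void openEngineAndSkipTranslogRecovery() throws InterruptedException {
        synchronized (engineLock) {
            TimeUnit.SECONDS.sleep(5);
        }
    }

    // Roughly the applier thread's side: closing also needs the lock,
    // so it waits for the open above to complete.
    void close() {
        synchronized (engineLock) {
            closed = true;
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```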
Meanwhile the applier thread appears to have been trying to close one of the recovering shards and was blocked here:
0.0% [cpu=0.0%, other=0.0%] (0s out of 500ms) cpu usage by thread 'elasticsearch[es-es-search-74ff455b96-m8c5k][clusterApplierService#updateTask][T#1]'
10/10 snapshots sharing following 21 elements
app/[email protected]/org.elasticsearch.index.shard.IndexShard.close(IndexShard.java:1724)
app/[email protected]/org.elasticsearch.index.IndexService.lambda$closeShard$16(IndexService.java:651)
app/[email protected]/org.elasticsearch.index.IndexService$$Lambda/0x000000c802b8fdd0.accept(Unknown Source)
app/[email protected]/org.elasticsearch.action.ActionListener.run(ActionListener.java:433)
app/[email protected]/org.elasticsearch.index.IndexService.closeShard(IndexService.java:629)
app/[email protected]/org.elasticsearch.index.IndexService.removeShard(IndexService.java:589)
app/[email protected]/org.elasticsearch.indices.cluster.IndicesClusterStateService.removeIndicesAndShards(IndicesClusterStateService.java:499)
app/[email protected]/org.elasticsearch.indices.cluster.IndicesClusterStateService.doApplyClusterState(IndicesClusterStateService.java:302)
app/[email protected]/org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:260)
app/[email protected]/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:544)
app/[email protected]/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:530)
app/[email protected]/org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:503)
app/[email protected]/org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
app/[email protected]/org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:156)
app/[email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
app/[email protected]/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:217)
app/[email protected]/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:183)
[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
[email protected]/java.lang.Thread.runWith(Thread.java:1596)
[email protected]/java.lang.Thread.run(Thread.java:1583)
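Since the applier removes shards inline while applying the cluster state, each close that is stuck behind an in-flight engine open adds its remaining open time to the apply duration, which would be consistent with the ~5s observed here. Continuing the sketch above, this small demo (illustrative names only) shows a recovery-style thread holding the lock while a simulated applier measures how long `close()` is blocked:

```java
// Continuing the sketch above: one recovery-style thread opens the engine while
// a simulated applier thread tries to close the same shard and measures how long
// close() is stuck behind the engine lock.
public class ApplierBlockingDemo {
    public static void main(String[] args) throws Exception {
        ShardSketch shard = new ShardSketch();

        Thread recovery = new Thread(() -> {
            try {
                shard.openEngineAndSkipTranslogRecovery();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "generic-recovery");
        recovery.start();
        Thread.sleep(100); // let the recovery thread take the engine lock first

        long start = System.nanoTime();
        shard.close(); // the "applier" blocks here until the engine open completes
        long waitedMs = java.util.concurrent.TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("close() waited " + waitedMs + " ms");
    }
}
```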
Relates #89821