Status: Open
Labels: Features (introduces a new unit of functionality that satisfies a requirement)
Description
What is the bug?
Consider the following scenario:
- A long-running merge task is in progress on Node1#Index1#Shard1.
- After the merge task has started, the shard begins relocating from Node1#Index1#Shard1 to Node2#Index1#Shard1.
- At the finalize step, the source node needs to call closeShard, but the close has to wait for the long-running merge task, as the stack trace below shows.
- The clusterApplierService thread stays blocked for about N minutes. The node is marked stale, and the master removes node1 from the cluster because node1 has not responded for a long time.
"opensearch[datanode1][clusterApplierService#updateTask][T#1]" #41 daemon prio=5 os_prio=0 cpu=5183.70ms elapsed=93132.85s tid=0x00007f3f392509d0 nid=0x101 in Object.wait() [0x00007f3f6ddfb000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait([email protected]/Native Method)
    - waiting on <no object reference available>
    at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:5410)
    - locked <0x0000001022b0abe8> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.abortMerges(IndexWriter.java:2721)
    - locked <0x0000001022b0abe8> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2469)
    - locked <0x0000001022b0abe8> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2449)
    - locked <0x0000001022bae6d0> (a java.lang.Object)
    at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2441)
    at org.opensearch.index.engine.InternalEngine.closeNoLock(InternalEngine.java:2370)
    at org.opensearch.index.engine.Engine.close(Engine.java:2000)
    at org.opensearch.index.engine.Engine.flushAndClose(Engine.java:1987)
    at org.opensearch.index.shard.IndexShard.close(IndexShard.java:1907)
    - locked <0x0000001022b07ea0> (a java.lang.Object)
    at org.opensearch.index.IndexService.closeShard(IndexService.java:623)
    at org.opensearch.index.IndexService.removeShard(IndexService.java:599)
    - locked <0x0000001022a976a8> (a org.opensearch.index.IndexService)
    at org.opensearch.index.IndexService.close(IndexService.java:374)
    - locked <0x0000001022a976a8> (a org.opensearch.index.IndexService)
    at org.opensearch.indices.IndicesService.removeIndex(IndicesService.java:993)
    at org.opensearch.indices.cluster.IndicesClusterStateService.removeIndices(IndicesClusterStateService.java:446)
    at org.opensearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:287)
    - locked <0x000000100b7da520> (a org.opensearch.indices.cluster.IndicesClusterStateService)
    at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:606)
    at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:593)
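The blocking pattern in this trace can be reproduced in miniature with plain Java. The sketch below is only an illustration under stated assumptions: it uses no OpenSearch or Lucene classes, and `SlowWriter`, `applierStallMillis`, and the thread name are hypothetical stand-ins. A single-threaded executor plays the role of the cluster applier thread, and a `close()` that waits on a monitor plays the role of the `rollback` -> `abortMerges` -> `doWait` chain.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class BlockedApplierSketch {

    /** Hypothetical stand-in for a writer whose close() waits for a running merge. */
    static final class SlowWriter {
        private boolean mergeRunning = false;

        synchronized void mergeStarted() {
            mergeRunning = true;
        }

        synchronized void mergeFinished() {
            mergeRunning = false;
            notifyAll(); // wake up any thread waiting in close()
        }

        /** Blocks the caller in a timed wait until the merge has finished. */
        synchronized void close() throws InterruptedException {
            while (mergeRunning) {
                wait(1000); // shows up as TIMED_WAITING (on object monitor)
            }
        }
    }

    /** Returns how long a task queued behind close() had to wait, in milliseconds. */
    static long applierStallMillis(long mergeMillis) throws Exception {
        SlowWriter writer = new SlowWriter();
        writer.mergeStarted();
        long start = System.nanoTime();

        // Simulated long-running merge on its own thread.
        Thread merge = new Thread(() -> {
            try {
                Thread.sleep(mergeMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                writer.mergeFinished();
            }
        }, "merge-thread");
        merge.start();

        // Single thread: tasks run strictly one after another, like an applier queue.
        ExecutorService applier = Executors.newSingleThreadExecutor();
        try {
            // First task: the shard close, which blocks behind the merge.
            applier.submit(() -> { writer.close(); return null; });
            // Second task: any later update; it cannot start until close() returns.
            Future<Long> next = applier.submit(() -> System.nanoTime() - start);
            return TimeUnit.NANOSECONDS.toMillis(next.get());
        } finally {
            applier.shutdown();
            merge.join();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("next update waited ~" + applierStallMillis(500) + " ms");
    }
}
```

In this sketch the second task waits roughly as long as the simulated merge runs, which mirrors how a slow shard close on the applier thread delays every later cluster state update on that node.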
PR #2529