Deadlock occurs during shutdown #1178

Open
@funky-eyes

Describe the bug

    public void destroy() {
        Optional.ofNullable(raftGroupService).ifPresent(r -> {
            r.shutdown();
            try {
                r.join();
            } catch (InterruptedException e) {
                logger.warn("Interrupted when RaftServer destroying", e);
            }
        });
    }

As the thread dump below shows, JRaft's group shutdown hands the shutdown work to a separate thread, and after NodeImpl's shutdown that thread also calls the join method:

"JRaft-Group-Default-Executor-3" #182 [221814] daemon prio=5 os_prio=0 cpu=62441.04ms elapsed=3535428.38s tid=0x00007f4c0411f200 nid=221814 waiting on condition  [0x00007f4bde594000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x00000007017fb0f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:371)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:519)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3780)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3725)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1707)
	at com.alipay.sofa.jraft.util.CountDownEvent.await(CountDownEvent.java:69)
	at com.alipay.sofa.jraft.storage.snapshot.SnapshotExecutorImpl.join(SnapshotExecutorImpl.java:748)
	at com.alipay.sofa.jraft.core.NodeImpl.join(NodeImpl.java:2891)
	- locked <0x00000007017e6d30> (a com.alipay.sofa.jraft.core.NodeImpl)
	at com.alipay.sofa.jraft.core.NodeImpl.lambda$shutdown$7(NodeImpl.java:2837)
	at com.alipay.sofa.jraft.core.NodeImpl$$Lambda/0x00007f4c57804f80.run(Unknown Source)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
	at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:317)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)

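The business thread then waits for that result, but the join method is declared synchronized. The shutdown hook thread cannot acquire the NodeImpl lock, so it hangs indefinitely and the application cannot go offline: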

"SpringApplicationShutdownHook" #34 [442820] prio=5 os_prio=0 cpu=320.57ms elapsed=421.49s tid=0x00007f4bfc1ad980 nid=442820 waiting for monitor entry  [0x00007f4bc0946000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.alipay.sofa.jraft.core.NodeImpl.join(NodeImpl.java:2883)
	- waiting to lock <0x00000007017e6d30> (a com.alipay.sofa.jraft.core.NodeImpl)
	at com.alipay.sofa.jraft.RaftGroupService.join(RaftGroupService.java:148)
	- locked <0x0000000701d33790> (a com.alipay.sofa.jraft.RaftGroupService)
	at org.apache.seata.server.cluster.raft.RaftServer.lambda$destroy$0(RaftServer.java:130)
	at org.apache.seata.server.cluster.raft.RaftServer$$Lambda/0x00007f4c577bf0f8.accept(Unknown Source)
	at java.util.Optional.ifPresent([email protected]/Optional.java:178)
	at org.apache.seata.server.cluster.raft.RaftServer.destroy(RaftServer.java:127)
	at org.apache.seata.server.cluster.raft.RaftServer.close(RaftServer.java:122)
	at org.apache.seata.server.cluster.raft.RaftServerManager.lambda$destroy$1(RaftServerManager.java:166)
	at org.apache.seata.server.cluster.raft.RaftServerManager$$Lambda/0x00007f4c577beed0.accept(Unknown Source)
	at java.util.HashMap.forEach([email protected]/HashMap.java:1429)
	at org.apache.seata.server.cluster.raft.RaftServerManager.destroy(RaftServerManager.java:165)
	at org.apache.seata.server.session.SessionHolder.destroy(SessionHolder.java:415)
	at org.apache.seata.server.coordinator.DefaultCoordinator.destroy(DefaultCoordinator.java:684)
	at org.apache.seata.server.ServerRunner.destroy(ServerRunner.java:90)
	at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:213)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:587)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:559)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1163)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:520)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1156)
	at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1123)
	at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1089)
	at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.doClose(ServletWebServerApplicationContext.java:174)
	at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:1035)
	- locked <0x0000000700cb6728> (a java.lang.Object)
	at org.springframework.boot.SpringApplicationShutdownHook.closeAndWait(SpringApplicationShutdownHook.java:145)
	at org.springframework.boot.SpringApplicationShutdownHook$$Lambda/0x00007f4c577b5538.accept(Unknown Source)
	at java.lang.Iterable.forEach([email protected]/Iterable.java:75)
	at org.springframework.boot.SpringApplicationShutdownHook.run(SpringApplicationShutdownHook.java:114)
	at java.lang.Thread.runWith([email protected]/Thread.java:1596)
	at java.lang.Thread.run([email protected]/Thread.java:1583)
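
The locking pattern visible in the two dumps can be reproduced in isolation. The following is a minimal sketch, not JRaft code: `FakeNode`, `pending`, and `joinEntered` are hypothetical names, and the latch is simply never released, standing in for whatever keeps `SnapshotExecutorImpl.join` from returning in the real system (the dumps do not show that cause). It only demonstrates that a second caller of a synchronized `join()` ends up `BLOCKED` while the first caller parks inside it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SynchronizedJoinHang {

    /** Hypothetical stand-in for a node whose join() is synchronized. */
    static class FakeNode {
        // Stands in for the event awaited inside join(); nothing releases it during the test.
        final CountDownLatch pending = new CountDownLatch(1);
        // Signals that some thread has entered join() and therefore owns the monitor.
        final CountDownLatch joinEntered = new CountDownLatch(1);

        /** Like the reported shutdown path: another thread runs join() on our behalf. */
        void shutdown(ExecutorService executor) {
            executor.submit(() -> {
                try {
                    join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        synchronized void join() throws InterruptedException {
            joinEntered.countDown();
            pending.await(); // parks while still holding this object's monitor
        }
    }

    /** Returns the state of a second thread that calls join() after shutdown(). */
    static Thread.State observeCallerState() throws InterruptedException {
        FakeNode node = new FakeNode();
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "group-executor");
            t.setDaemon(true);
            return t;
        });
        node.shutdown(executor);
        node.joinEntered.await(); // the executor thread now holds the monitor

        // Plays the role of the shutdown hook thread calling join() directly.
        Thread caller = new Thread(() -> {
            try {
                node.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "shutdown-hook");
        caller.setDaemon(true);
        caller.start();

        // Poll briefly until the caller is stuck on the monitor, or give up after 5 seconds.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (caller.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        Thread.State state = caller.getState();

        node.pending.countDown(); // release both threads so the demo terminates
        executor.shutdown();
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("caller state: " + observeCallerState());
    }
}
```

In this sketch the caller leaves the `BLOCKED` state only because the demo releases the latch at the end. If the awaited event never fires, as appears to be the case in the dumps above, the second caller waits for the monitor indefinitely.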

Expected behavior

Actual behavior

Steps to reproduce

Minimal yet complete reproducer code (or GitHub URL to code)

Environment

  • SOFAJRaft version:
  • JVM version (e.g. java -version):
  • OS version (e.g. uname -a):
  • Maven version:
  • IDE version:
