Description
Ⅰ. Issue Description
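In TCC mode with the useTCCFence feature enabled and the MySQL transaction isolation level set to RR (REPEATABLE READ), if the prepare phase hangs and the rollback phase also hangs, the following exception is thrown: "Deadlock found when trying to get lock; try restarting transaction".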
Ⅱ. Describe what happened
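In TCC mode with the MySQL transaction isolation level set to RR, suppose the prepare phase hangs and the rollback phase hangs as well. The rollback method (org.apache.seata.rm.fence.SpringFenceHandler#rollbackFence) is retried, so once the rollback hang clears (while prepare is still hanging, or runs one step behind rollback), several requests may execute the rollback method at the same time. Each request opens its own local transaction, and each local transaction runs one "select ... for update" query. Because the prepare phase is still hanging, the tcc_fence_log table does not yet contain a fence record for this branch transaction. Since the record does not exist, the "select ... for update" query degrades from a row lock to a gap lock. Different transactions can hold a gap lock on the same range simultaneously, so none of these rollback requests is blocked and they all go on to execute the insert. When inserting, each of them has to wait for the gap locks held by the others, and a deadlock occurs.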
2024-07-15 21:57:09.942 ERROR 86408 --- [h_RMROLE_1_1_24] io.seata.rm.AbstractResourceManager : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979087.
io.seata.common.exception.StoreException: Deadlock found when trying to get lock; try restarting transaction
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:133)
at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:128)
... 20 common frames omitted
2024-07-15 21:57:09.942 ERROR 86408 --- [h_RMROLE_1_3_24] io.seata.rm.AbstractResourceManager : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979087.
io.seata.common.exception.StoreException: Deadlock found when trying to get lock; try restarting transaction
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:133)
at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:128)
... 20 common frames omitted
2024-07-15 21:57:09.943 INFO 86408 --- [h_RMROLE_1_3_24] io.seata.rm.AbstractRMHandler : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
2024-07-15 21:57:09.943 INFO 86408 --- [h_RMROLE_1_1_24] io.seata.rm.AbstractRMHandler : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
2024-07-15 21:57:09.952 ERROR 86408 --- [h_RMROLE_1_5_24] io.seata.rm.AbstractResourceManager : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979087.
io.seata.common.exception.StoreException: Deadlock found when trying to get lock; try restarting transaction
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:133)
at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:128)
... 20 common frames omitted
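The interleaving described in section Ⅱ can be illustrated with a small toy model. This is only a sketch of the lock compatibility rules stated above (gap locks do not block each other, an insert waits for every other transaction's gap lock); it is not InnoDB and does not touch a database. The class name GapLockDeadlockModel, its methods, and the transaction labels are hypothetical and exist only for this illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the interleaving; NOT InnoDB, only the rules described in the issue. */
class GapLockDeadlockModel {
    // Transactions holding a gap lock on the gap where the missing fence row would go.
    private final Set<String> gapHolders = new HashSet<>();
    // Wait-for edges: waiting transaction -> transactions it waits on.
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    /** select ... for update on a missing row: gap locks are compatible, so this never blocks. */
    void selectForUpdateMissingRow(String tx) {
        gapHolders.add(tx);
    }

    /** insert into the gap: must wait for every OTHER holder's gap lock. Returns true if tx waits. */
    boolean insert(String tx) {
        Set<String> blockers = new HashSet<>(gapHolders);
        blockers.remove(tx);
        if (!blockers.isEmpty()) {
            waitsFor.put(tx, blockers);
        }
        return !blockers.isEmpty();
    }

    /** A deadlock exists when the wait-for graph contains a cycle. */
    boolean hasDeadlock() {
        for (String start : waitsFor.keySet()) {
            if (reaches(start, start, new HashSet<>())) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search: can we get from 'from' back to 'target' along wait-for edges?
    private boolean reaches(String from, String target, Set<String> visited) {
        for (String next : waitsFor.getOrDefault(from, Collections.emptySet())) {
            if (next.equals(target)) {
                return true;
            }
            if (visited.add(next) && reaches(next, target, visited)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        GapLockDeadlockModel model = new GapLockDeadlockModel();
        // Two retried rollback requests, each in its own local transaction.
        model.selectForUpdateMissingRow("rollback-1"); // not blocked
        model.selectForUpdateMissingRow("rollback-2"); // not blocked either
        model.insert("rollback-1"); // waits for rollback-2's gap lock
        model.insert("rollback-2"); // waits for rollback-1's gap lock
        System.out.println("deadlock = " + model.hasDeadlock()); // prints "deadlock = true"
    }
}
```

In this model a single rollback request never waits, which matches the observation that the problem only appears when several rollback retries overlap while the fence record is still missing.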
Ⅲ. Describe what you expected to happen
When multiple rollback requests run concurrently and conflict, a "duplicate key exception" should be raised instead of a deadlock.
Ⅳ. Anything else we need to know?
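I have 4 ideas (options):
- Option 1: In the rollback method, swap the order of the "select ... for update" and insert operations: run the insert first, then the for update, which avoids the deadlock caused by gap locks. (However, the insert of the tcc fence record is normally done in the prepare phase, and cases where a prepare hang shifts that insert into the rollback method are rare. Swapping the two operations would make every rollback execute two SQL statements and reduce performance, so this is not recommended.)
- Option 2: Use middleware such as redis to provide a distributed lock, and require the prepareFence, commitFence, and rollbackFence operations of org.apache.seata.rm.fence.SpringFenceHandler to acquire that lock before they run. This also avoids the deadlock. (However, each of these three operations would then need two network I/O round trips, which also reduces performance, so this is not recommended either.)
- Option 3: The deadlock is caused by gap locks under the RR level. If the transaction isolation level is lowered to RC (READ COMMITTED), there are no gap locks and therefore no deadlock. The DB operations of prepareFence, commitFence, and rollbackFence in org.apache.seata.rm.fence.SpringFenceHandler go through org.springframework.transaction.support.TransactionTemplate#execute, so only these three methods need to change: while they run, temporarily switch the isolation level of the TransactionTemplate to RC, then restore the default isolation level afterwards.
- Option 4 (recommended): For the same reason as option 3, lowering the isolation level to RC removes the gap locks and the deadlock. Add a transaction isolation level property, isolationLevel, to the io.seata.rm.tcc.config.TCCFenceConfig configuration, so that users can customize the isolation level used by tccFence through seata.tcc.fence.isolationLevel. In org.apache.seata.rm.fence.SpringFenceConfig#afterPropertiesSet, if the user has not set a custom isolation level, keep the default one; if the user has set one, replace the isolation level of the TransactionTemplate with the custom value. This extension point lets users resolve the deadlock under the RR level.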
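As a rough sketch of the recommended option, the helper below maps a configured value of seata.tcc.fence.isolationLevel to an isolation constant. It is an assumption-laden illustration, not Seata code: the class name FenceIsolationLevels, the resolve method, and the accepted value names (READ_COMMITTED and so on) are hypothetical. It uses the java.sql.Connection constants, which have the same numeric values as Spring's TransactionDefinition.ISOLATION_* constants, so the block runs without Spring on the classpath.

```java
import java.sql.Connection;
import java.util.Locale;

/** Hypothetical helper for the proposed seata.tcc.fence.isolationLevel property. */
class FenceIsolationLevels {
    /** Same value as Spring's TransactionDefinition.ISOLATION_DEFAULT: keep the datasource default. */
    static final int ISOLATION_DEFAULT = -1;

    /**
     * Resolves the configured value. The accepted names are an assumption made
     * for this sketch; the issue does not specify the property's value format.
     */
    static int resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return ISOLATION_DEFAULT; // user did not customize the isolation level
        }
        switch (configured.trim().toUpperCase(Locale.ROOT)) {
            case "READ_UNCOMMITTED":
                return Connection.TRANSACTION_READ_UNCOMMITTED;
            case "READ_COMMITTED":
                return Connection.TRANSACTION_READ_COMMITTED;
            case "REPEATABLE_READ":
                return Connection.TRANSACTION_REPEATABLE_READ;
            case "SERIALIZABLE":
                return Connection.TRANSACTION_SERIALIZABLE;
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + configured);
        }
    }

    public static void main(String[] args) {
        int level = resolve("READ_COMMITTED");
        // In afterPropertiesSet, the idea would be roughly:
        //   if (level != ISOLATION_DEFAULT) { transactionTemplate.setIsolationLevel(level); }
        System.out.println("resolved isolation level = " + level); // prints 2
    }
}
```

Leaving the default untouched when nothing is configured keeps existing deployments unchanged, and only users who hit the RR deadlock would need to opt in to RC.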
After applying option 3 or option 4, when the prepare phase hangs and the rollback phase also hangs, the error below is reported and the deadlock is avoided:
2024-07-15 23:28:58.090 ERROR 59548 --- [h_RMROLE_1_4_24] io.seata.rm.AbstractResourceManager : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979184.
io.seata.rm.tcc.exception.TCCFenceException: Insert tcc fence record duplicate key exception. xid= 10.244.137.109:8091:8944687993233979184, branchId= 8944687993233979186
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:130)
at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
2024-07-15 23:28:58.090 ERROR 59548 --- [h_RMROLE_1_2_24] io.seata.rm.AbstractResourceManager : rollback TCC resource error, resourceId: updateInventoryAcquire, xid: 10.244.137.109:8091:8944687993233979184.
io.seata.rm.tcc.exception.TCCFenceException: Insert tcc fence record duplicate key exception. xid= 10.244.137.109:8091:8944687993233979184, branchId= 8944687993233979186
at io.seata.rm.tcc.store.db.TCCFenceStoreDataBaseDAO.insertTCCFenceDO(TCCFenceStoreDataBaseDAO.java:130)
at io.seata.rm.tcc.TCCFenceHandler.insertTCCFenceLog(TCCFenceHandler.java:235)
at io.seata.rm.tcc.TCCFenceHandler.lambda$rollbackFence$2(TCCFenceHandler.java:193)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at io.seata.rm.tcc.TCCFenceHandler.rollbackFence(TCCFenceHandler.java:187)
at io.seata.rm.tcc.TCCResourceManager.branchRollback(TCCResourceManager.java:164)
at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67)
at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63)
at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:131)
at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63)
at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:68)
at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35)
at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63)
at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58)
at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:281)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
If this is confirmed to be a bug or a worthwhile optimization, I can try to submit a PR with the change.
Ⅵ. Environment:
JDK version(e.g. java -version): 11
Seata client/server version: 1.8.0
Database version: 8.0.29