[Bug]: Deadlock during startup #45

@szl97

Description

Version

main

Component

API (Backend)

Bug Description

  • Problem 1:
    @Transactional(isolation = SERIALIZABLE)
    public Long register(String ip, int port) {
        InstanceRecord rec = db.selectFrom(INSTANCE)
                .where(INSTANCE.IP.eq(ip).and(INSTANCE.PORT.eq(port))).fetchOne();
        if(rec == null) {
            rec = findIdle();
            rec.set(INSTANCE.IP, ip);
            rec.set(INSTANCE.PORT, port);
        }
        rec.setStatus(1);
        rec.set(INSTANCE.MTIME, LocalDateTime.now());
        rec.store();
        return rec.getId();
    }

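With this code under the SERIALIZABLE isolation level, when the select finds no row it takes a shared gap lock on the (ip, port) index, and the later insert needs a write lock. If several service instances start together and another instance runs its select between this instance's select and insert, with both selects covering the same gap, each insert waits on the other's gap lock and the transactions deadlock.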

  • Problem 2:
    private InstanceRecord findIdle() {
        InstanceRecord rec = db.selectFrom(INSTANCE)
                .where(INSTANCE.STATUS.eq(0)).limit(1).fetchAny();
        if(rec == null) {
            rec = INSTANCE.newRecord();
            rec.set(INSTANCE.CTIME, LocalDateTime.now());
            rec.attach(db.configuration());
        }
        return rec;
    }

With this code, two instances may select the same row. Each takes a read lock on it, so neither can modify it, and they deadlock.
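
Both problems have the same shape: two transactions each hold a shared lock on one resource (an index gap in Problem 1, a row in Problem 2) and then each asks for exclusive access to it. The sketch below is a toy lock table in plain Java, not a database and not code from this project. It only illustrates why that interleaving produces a cycle in the wait-for graph while a serial startup does not. The class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model (an assumption, not real database behaviour in full): one
// lockable resource with shared holders, plus a wait-for graph.
class SharedLockDeadlockSketch {
    private final Set<String> sharedHolders = new LinkedHashSet<>();
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    // Models the SELECT under SERIALIZABLE: shared locks are compatible
    // with each other, so this always succeeds.
    void select(String tx) {
        sharedHolders.add(tx);
    }

    // Models the INSERT/UPDATE: it needs exclusive access, so it must wait
    // for every other shared holder. Returns true if it can proceed now.
    boolean write(String tx) {
        Set<String> blockers = new LinkedHashSet<>(sharedHolders);
        blockers.remove(tx);
        if (blockers.isEmpty()) {
            return true;
        }
        waitsFor.put(tx, blockers);
        return false;
    }

    // Commit releases the shared lock and stops blocking other waiters.
    void commit(String tx) {
        sharedHolders.remove(tx);
        waitsFor.remove(tx);
        for (Set<String> blockers : waitsFor.values()) {
            blockers.remove(tx);
        }
    }

    // A deadlock is a cycle in the wait-for graph.
    boolean hasDeadlock() {
        for (String start : waitsFor.keySet()) {
            if (reaches(start, start, new LinkedHashSet<>())) {
                return true;
            }
        }
        return false;
    }

    private boolean reaches(String from, String target, Set<String> seen) {
        for (String next : waitsFor.getOrDefault(from, Collections.emptySet())) {
            if (next.equals(target)) {
                return true;
            }
            if (seen.add(next) && reaches(next, target, seen)) {
                return true;
            }
        }
        return false;
    }

    // Interleaving from the report: both instances SELECT before either writes.
    static boolean interleaved() {
        SharedLockDeadlockSketch db = new SharedLockDeadlockSketch();
        db.select("A");
        db.select("B");
        db.write("A"); // blocked by B's shared lock
        db.write("B"); // blocked by A's shared lock
        return db.hasDeadlock();
    }

    // Serial startup: A commits before B begins, so nobody waits.
    static boolean serial() {
        SharedLockDeadlockSketch db = new SharedLockDeadlockSketch();
        db.select("A");
        boolean aOk = db.write("A");
        db.commit("A");
        db.select("B");
        boolean bOk = db.write("B");
        return aOk && bOk && !db.hasDeadlock();
    }

    public static void main(String[] args) {
        System.out.println("interleaved deadlocks: " + interleaved());
        System.out.println("serial completes: " + serial());
    }
}
```

In the interleaved run, A waits for B and B waits for A, which is the cycle a database deadlock detector would break by rolling one transaction back.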

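Lower the transaction isolation level to READ COMMITTED and use forUpdate + skipLocked. Do not use the default isolation level: with it, running the status query with forUpdate locks the table, so startup becomes fully serial.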
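
A minimal sketch of that suggestion for the idle-row lookup, written with plain JDBC so it stays self-contained (the project itself uses jOOQ, where the equivalent calls would be forUpdate() and skipLocked()). The table and column names (`instance`, `id`, `ip`, `port`, `status`, `mtime`) are assumptions guessed from the identifiers in the snippets above, and the class name is hypothetical. `FOR UPDATE SKIP LOCKED` also needs database support, for example MySQL 8.0+ or PostgreSQL 9.5+.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class SkipLockedClaimSketch {
    // Assumed schema: instance(id, ip, port, status, mtime).
    // Rows already locked by another starting instance are skipped
    // instead of waited on, so two instances never pick the same row.
    static final String CLAIM_SQL =
            "SELECT id FROM instance WHERE status = 0 LIMIT 1 FOR UPDATE SKIP LOCKED";
    static final String UPDATE_SQL =
            "UPDATE instance SET ip = ?, port = ?, status = 1, mtime = CURRENT_TIMESTAMP"
            + " WHERE id = ?";

    /** Returns the id of the claimed idle row, or null if none is free. */
    static Long claimIdle(Connection conn, String ip, int port) throws SQLException {
        // READ COMMITTED, as suggested in the report, set before the
        // transaction starts.
        conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        conn.setAutoCommit(false);
        try {
            Long id = null;
            try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL);
                 ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    id = rs.getLong(1);
                }
            }
            if (id != null) {
                // The row is exclusively locked by this transaction until commit.
                try (PreparedStatement ps = conn.prepareStatement(UPDATE_SQL)) {
                    ps.setString(1, ip);
                    ps.setInt(2, port);
                    ps.setLong(3, id);
                    ps.executeUpdate();
                }
            }
            conn.commit();
            return id;
        } catch (SQLException e) {
            conn.rollback(); // release any row lock taken above
            throw e;
        }
    }
}
```

When `claimIdle` returns null the caller would fall back to inserting a fresh row, as findIdle() does today. That path is left out here.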

Steps to Reproduce

Start multiple service instances.

Expected Behavior

No deadlock.

Environment

any

Error Logs

Additional Context

No response
