Commit 5ba686f

Reduce etcd lease keepalive renewal interval to 1/3rd of TTL (#1573)
## Summary

Reduces the etcd lease keepalive interval from 70% to 33% of TTL to align with etcd's official Go client behavior.

## Problem

LavinMQ instances sometimes experience sporadic leadership loss with "Lease expired" errors. The 70% sleep interval left only a 3-second buffer (with 10s TTL) for keepalive requests to complete, which could be insufficient for handling network latency or slow responses. It's not clear if the short time for keepalive requests is the issue, but this should hopefully reduce the chances of losing leadership without good reason.

## Solution

Changed interval from `ttl * 0.7` to `ttl / 3`, matching etcd's clientv3:

```go
nextKeepAlive := time.Now().Add((time.Duration(karesp.TTL) * time.Second) / 3.0)
```

**With 10s TTL:**
- Before: 7s sleep, 3s buffer
- After: 3.3s sleep, 6.7s buffer

Reference: https://github.com/etcd-io/etcd/blob/main/client/v3/lease.go#L542

Might fix #1486

🤖 Generated with [Claude Code](https://claude.com/claude-code)
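
The sleep and buffer figures quoted for a 10s TTL can be checked with a few lines of Crystal. This is only a sketch: `keepalive_budget` is a hypothetical helper written for this illustration and is not part of LavinMQ.

```crystal
# Hypothetical helper for illustration; not part of LavinMQ.
# Returns {sleep, buffer} in seconds, given a lease TTL and the
# fraction of the TTL spent sleeping between keepalive requests.
def keepalive_budget(ttl : Int32, fraction : Float64) : Tuple(Float64, Float64)
  sleep_s = ttl * fraction
  {sleep_s, ttl - sleep_s}
end

# Old behaviour: sleep for 70% of the TTL.
old_sleep, old_buffer = keepalive_budget(10, 0.7)
puts "before: sleep #{old_sleep.round(1)}s, buffer #{old_buffer.round(1)}s"

# New behaviour: sleep for a third of the TTL.
new_sleep, new_buffer = keepalive_budget(10, 1 / 3)
puts "after: sleep #{new_sleep.round(1)}s, buffer #{new_buffer.round(1)}s"
```

Rounded to one decimal place, the two lines should match the 7s/3s and 3.3s/6.7s figures in the commit message.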
1 parent e7cecf3 commit 5ba686f

1 file changed: +1 -1 lines changed

src/lavinmq/etcd/lease.cr

Lines changed: 1 addition & 1 deletion
    @@ -30,7 +30,7 @@ module LavinMQ

           private def keepalive_loop(ttl : Int32)
             loop do
    -          sleep (ttl * 0.7).seconds
    +          sleep (ttl / 3).seconds
               ttl = @etcd.lease_keepalive(@id)
             end
           rescue ex : Etcd::Error # only rescue etcd errors
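
To see how the changed loop behaves when the server returns a different TTL on renewal, the following sketch replays it against a stand-in client. Everything here is an assumption made for illustration: `FakeEtcd`, its canned TTL values, the `Int64` lease id, and `keepalive_waits` are hypothetical and not part of LavinMQ. The waits are recorded instead of slept so the example finishes instantly.

```crystal
# Stand-in for the etcd client; FakeEtcd and its canned TTLs are hypothetical.
class FakeEtcd
  def initialize(@ttls : Array(Int32))
  end

  # Mimics a keepalive call that returns the TTL granted by the server.
  def lease_keepalive(id : Int64) : Int32
    @ttls.shift
  end

  def empty? : Bool
    @ttls.empty?
  end
end

# Same shape as the loop in the diff, but it records each wait instead of
# calling `sleep`, and stops once the canned responses run out.
def keepalive_waits(etcd : FakeEtcd, id : Int64, ttl : Int32) : Array(Time::Span)
  waits = [] of Time::Span
  until etcd.empty?
    waits << (ttl / 3).seconds     # wait a third of the most recent TTL
    ttl = etcd.lease_keepalive(id) # the server may grant a different TTL
  end
  waits
end

waits = keepalive_waits(FakeEtcd.new([5, 10]), 1_i64, 10)
puts waits.map(&.total_seconds.round(2)).join(", ")
```

Because `ttl` is reassigned from each keepalive response, the second wait is derived from the 5s TTL returned by the first renewal rather than from the initial 10s.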

0 commit comments
