Lettuce Cluster: Master-Slave Switch Causes Connection Pool Doubling #3595

@kael-aiur

Bug Report

When a Redis cluster performs a master-slave switch, the number of connections to the new master node grows to roughly double the maximum number of connections in the connection pool.

Current Behavior

Our Redis cluster has 24 nodes running in cluster mode, and many services use the same cluster. We found that when a master-slave switch occurs on this cluster, the number of connections from every service to the new master node roughly doubles.

This happens because Lettuce creates new connections to the new master but never closes the old connections (the ones created while the node was still a slave):

[root@springapps]# netstat -anp  | grep :637 | awk '{print $5}' | sort | uniq -c
    646 10.47.100.45:6371
    386 10.47.100.45:6372
    388 10.47.100.45:6373
    386 10.47.100.45:6374
    388 10.47.100.45:6375

10.47.100.45:6371 is the new master. It holds 646 connections, against 386 to 388 for each of the other nodes.

The key code is in io.lettuce.core.cluster.PooledClusterConnectionProvider:

  private boolean isStale(ConnectionKey connectionKey) {

      if (connectionKey.nodeId != null && partitions.getPartitionByNodeId(connectionKey.nodeId) != null) {
          return false;
      }

      if (connectionKey.host != null && partitions.getPartition(connectionKey.host, connectionKey.port) != null) {
          return false;
      }

      return true;
  }

Nothing checks whether the intent of the connectionKey matches the role of the node returned by partitions.getPartition(connectionKey.host, connectionKey.port). When a RedisClusterNode changes role from slave to master, its existing connections are therefore not reported as stale, even though they actually are.

Expected behavior/code

This method needs to be fixed. Demo code, for example:

  private boolean isStale(ConnectionKey connectionKey) {

      if (connectionKey.nodeId != null && partitions.getPartitionByNodeId(connectionKey.nodeId) != null) {
          return false;
      }

      if (connectionKey.host != null) {
          // look up the node only when the key carries a host, as the original check did
          RedisClusterNode node = partitions.getPartition(connectionKey.host, connectionKey.port);

          if (node != null) {
              // a READ connection to a node whose role is now master is stale
              return connectionKey.intent == Intent.READ && node.getRole() == RedisInstance.Role.MASTER;
          }
      }

      return true;
  }
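
To make the difference between the two checks easier to see, here is a minimal, self-contained sketch that models it without Lettuce. Everything in it is a hypothetical stand-in: `Key` plays the part of ConnectionKey, the `Map` plays the part of the partitions lookup by host and port, and `isStaleCurrent` / `isStaleProposed` only mirror the host/port branch of the two methods above. It is not the library's implementation and it ignores keys that carry a nodeId.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleCheckSketch {

    enum Intent { READ, WRITE }

    enum Role { MASTER, SLAVE }

    // Hypothetical stand-in for a ConnectionKey addressed by host and port.
    static final class Key {
        final Intent intent;
        final String host;
        final int port;

        Key(Intent intent, String host, int port) {
            this.intent = intent;
            this.host = host;
            this.port = port;
        }

        String address() {
            return host + ":" + port;
        }
    }

    // Models the current check: a key is stale only if its node left the topology.
    static boolean isStaleCurrent(Map<String, Role> topology, Key key) {
        return !topology.containsKey(key.address());
    }

    // Models the proposed check: a READ key is also stale once its node is a master.
    static boolean isStaleProposed(Map<String, Role> topology, Key key) {
        Role role = topology.get(key.address());
        if (role == null) {
            return true;
        }
        return key.intent == Intent.READ && role == Role.MASTER;
    }

    public static void main(String[] args) {
        Map<String, Role> topology = new HashMap<>();
        Key readKey = new Key(Intent.READ, "10.47.100.45", 6371);

        // Before the switch the node is a slave: neither check reports the key as stale.
        topology.put(readKey.address(), Role.SLAVE);
        System.out.println(isStaleCurrent(topology, readKey) + " " + isStaleProposed(topology, readKey));

        // After the switch the node is a master: only the proposed check reports it as stale.
        topology.put(readKey.address(), Role.MASTER);
        System.out.println(isStaleCurrent(topology, readKey) + " " + isStaleProposed(topology, readKey));
    }
}
```

In this model a WRITE key pointing at a master stays non-stale under both checks, so only the leftover READ connections would be closed after a promotion.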

Environment

  • Lettuce version(s): 5.3.0.RELEASE
  • Redis version: 6.2.1

Possible Solution

I will fix this bug on the 5.3.x branch.

Labels: size: small (1 to 2 development weeks)
