
[NEW] Primary replica role at the slot level #1372

Open
@zvi-code

Description


The problem/use-case that the feature addresses

Currently, replicas provide availability, increased durability (from a domain failure perspective), and performance improvements when using Read from Replica (RFR). However, this performance scaling is limited to stale reads and does not extend to regular writes and reads. Many customers do not use RFR for various reasons. Additionally, when a primary node fails, all write traffic to that node fails, potentially requiring application-level logic to handle the failure.

Description of the feature

We propose redefining role assignments from the node level to the slot level. In this model, a node can be the primary for certain slots and a replica for others. This involves adjusting the codebase so that any primary/replica designations are applied to slots rather than nodes. Essentially, the node becomes a logical container of compute, memory, and services that manages atomic data entities (slots).
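
To make the slot-level model concrete, here is a minimal sketch of a node that holds a role per slot instead of a single node-wide role. The names `Role`, `Node`, `slot_roles`, `role_for`, and `accepts_writes` are hypothetical and chosen for illustration; they do not correspond to existing structures in the codebase.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Role(Enum):
    PRIMARY = "primary"
    REPLICA = "replica"


@dataclass
class Node:
    """Hypothetical node: a container that tracks a role per slot."""
    name: str
    slot_roles: Dict[int, Role] = field(default_factory=dict)

    def role_for(self, slot: int) -> Optional[Role]:
        # None means this node does not hold the slot at all.
        return self.slot_roles.get(slot)

    def accepts_writes(self, slot: int) -> bool:
        # Writes are only accepted for slots where this node is primary.
        return self.slot_roles.get(slot) is Role.PRIMARY


# Two nodes in one shard, each primary for one slot and replica for the other.
node_a = Node("A", {0: Role.PRIMARY, 1: Role.REPLICA})
node_b = Node("B", {0: Role.REPLICA, 1: Role.PRIMARY})
```

In this sketch both nodes serve writes, each for a different slot, which is the property the proposal relies on.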

With this approach, we can scale the performance of both writes and reads based on the number of nodes in a shard, eliminating the concept of a replica node. If a node fails, only the slots for which it was the primary are directly impacted, improving fault granularity and isolation.
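
The effect on failure impact can be illustrated with a small sketch. It assumes a round-robin placement of slot primaries across the nodes of a shard, which is only one possible policy and not part of the proposal; `assign_primaries` and `slots_losing_primary` are hypothetical helper names.

```python
def assign_primaries(slots, nodes):
    """Assumed policy: the i-th slot's primary is nodes[i % len(nodes)]."""
    return {slot: nodes[i % len(nodes)] for i, slot in enumerate(slots)}


def slots_losing_primary(primary_of, failed_node):
    """Slots whose primary was the failed node; all other slots keep theirs."""
    return sorted(s for s, n in primary_of.items() if n == failed_node)


# A shard of three nodes holding twelve slots (numbers chosen for illustration).
nodes = ["A", "B", "C"]
primary_of = assign_primaries(range(12), nodes)

impacted = slots_losing_primary(primary_of, "B")
print(impacted)  # [1, 4, 7, 10]
```

With three nodes, a single node failure interrupts writes for one third of the shard's slots rather than all of them.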

The recent introduction of dict-per-slot has shifted many processes to operate at the slot level, which facilitates the transition to this model. As part of this feature, we will need to continue down this path for other flows in the system, such as bgsave.

This change would require client support. For clients that lack it, we can initially implement the feature in a degenerate form where all slots in a shard have the same primary node, maintaining backward compatibility.
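
As a rough sketch of the degenerate form, the check below treats a shard as compatible with existing clients when every slot in it maps to the same primary. The function names and the slot-to-primary dictionary are assumptions made for illustration, not a description of how the server would track this.

```python
def degenerate_assignment(slots, primary):
    """Backward-compatible layout: one node is primary for every slot."""
    return {slot: primary for slot in slots}


def is_degenerate(primary_of):
    """True when all slots in the shard share a single primary node."""
    return len(set(primary_of.values())) <= 1


legacy_layout = degenerate_assignment(range(4), "A")
mixed_layout = {0: "A", 1: "B", 2: "A", 3: "B"}

print(is_degenerate(legacy_layout))  # True
print(is_degenerate(mixed_layout))   # False
```

A client that only understands node-level roles would see the degenerate layout as today's single-primary shard.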

Additional information

An added benefit of this approach is the potential to reduce code complexity by unifying the code paths of replication and slot migration, which are currently two similar processes for maintaining data consistency between nodes.

For Cluster Mode Disabled (CMD), we can consider all data to reside in slot 0. In the long term, we might consider enabling slots (or logical grouping) for CMD, allowing customers to gain the benefits of this model without adopting Cluster Mode Enabled (CME).
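
The CMD mapping can be sketched as follows. The cluster branch assumes the existing key hashing (CRC16, XMODEM variant, modulo 16384) stays unchanged; hash tag handling is omitted for brevity, and `slot_for_key` and `cluster_enabled` are hypothetical names.

```python
import binascii

NUM_SLOTS = 16384


def slot_for_key(key: bytes, cluster_enabled: bool) -> int:
    """Map a key to a slot; with cluster mode disabled everything is slot 0."""
    if not cluster_enabled:
        return 0
    # binascii.crc_hqx with initial value 0 computes CRC16 (XMODEM).
    # Hash tags ({...}) are intentionally not handled in this sketch.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


print(slot_for_key(b"123456789", cluster_enabled=True))   # 12739
print(slot_for_key(b"123456789", cluster_enabled=False))  # 0
```

Treating CMD as a single-slot deployment lets the same slot-level role logic apply in both modes.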

Metadata

Labels: client-changes-needed (Client changes may be required for this feature)

Status: Researching