Add multi-database support to cluster mode #1671

Merged
merged 61 commits into from
May 4, 2025

xbasel
Member

@xbasel xbasel commented Feb 5, 2025

cluster: add multi-database support in cluster mode

Add multi-database support in cluster mode to align with standalone mode
and facilitate migration. Previously, cluster mode was restricted to a
single database (DB0). This change allows multiple databases while
preserving the existing slot-based key distribution.

Key Features:

  • Database-Agnostic Hashing. The hashing algorithm is unchanged.
    Identical keys always map to the same slot across all databases,
    ensuring consistent key distribution and compatibility with
    existing single-database setups.
  • Multi-DB commands support. SELECT, MOVE, and COPY are now supported in
    cluster mode.
  • Fully backward compatible with no API changes.
  • SWAPDB is not supported in cluster mode. It is unsafe due to inconsistency risks.
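
To make the database-agnostic hashing point concrete, here is a minimal Python sketch of the standard cluster key-to-slot mapping (CRC16/XMODEM of the key, or of its hash tag, modulo 16384). The function name `key_hash_slot` is illustrative and not taken from this PR. The database index is not an input, so a given key maps to the same slot in every database.

```python
import binascii

CLUSTER_SLOTS = 16384


def key_hash_slot(key: bytes) -> int:
    """Map a key to a cluster slot; the database index plays no part."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {tag} narrows the hashed portion of the key.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(key, 0) % CLUSTER_SLOTS


if __name__ == "__main__":
    print(key_hash_slot(b"foo"))
    print(key_hash_slot(b"{user1000}.following") == key_hash_slot(b"{user1000}.followers"))
```

Because the slot depends only on the key bytes, a client's existing slot map stays valid after `SELECT`.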

Command-Level Changes:

  • SELECT / MOVE / COPY are now supported in cluster mode.
  • MOVE / COPY (with db) are rejected (TRYAGAIN error) during slot migration to prevent multi-DB inconsistencies.
  • SWAPDB will return an error if used when cluster mode is enabled.
  • GETKEYSINSLOT, COUNTKEYSINSLOT and MIGRATE will operate in the context of the selected database.
    This means, for example, that migrating keys in a slot will require iterating and repeating across all databases.

Slot Migration Process:

  • Multi-DB support in cluster mode affects slot migration. Operators should now iterate over all the configured databases.
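
The per-database iteration described above can be sketched as follows. This is an illustration under stated assumptions, not the valkey-cli implementation: `send` is a hypothetical callable that sends one command and returns its reply, `drain_slot_all_dbs` and `FakeNode` are made-up names, and passing the source database index as the destination database of `MIGRATE` is an assumption. The `CLUSTER SETSLOT` steps that surround a real migration are omitted.

```python
from typing import Callable, Dict, List


def drain_slot_all_dbs(send: Callable[..., object], slot: int, databases: int,
                       host: str, port: int, batch: int = 100,
                       timeout_ms: int = 5000) -> int:
    """Migrate every key of `slot` to host:port, one database at a time."""
    moved = 0
    for db in range(databases):
        send("SELECT", db)  # slot commands act on the selected database
        while True:
            keys = send("CLUSTER", "GETKEYSINSLOT", slot, batch)
            if not keys:
                break
            # An empty key argument plus KEYS is the multi-key form of MIGRATE.
            send("MIGRATE", host, port, "", db, timeout_ms, "KEYS", *keys)
            moved += len(keys)
    return moved


class FakeNode:
    """In-memory stand-in for a source node, for demonstration only.

    Every stored key is treated as belonging to the requested slot.
    """

    def __init__(self, keys_by_db: Dict[int, List[str]]):
        self.keys_by_db = {db: list(keys) for db, keys in keys_by_db.items()}
        self.db = 0
        self.log: List[tuple] = []

    def send(self, *args):
        self.log.append(args)
        if args[0] == "SELECT":
            self.db = args[1]
            return "OK"
        if args[:2] == ("CLUSTER", "GETKEYSINSLOT"):
            return self.keys_by_db.get(self.db, [])[:args[3]]
        if args[0] == "MIGRATE":
            migrated = set(args[args.index("KEYS") + 1:])
            remaining = [k for k in self.keys_by_db[self.db] if k not in migrated]
            self.keys_by_db[self.db] = remaining
            return "OK"
        raise ValueError(f"unsupported command: {args[0]}")
```

With a real client, `send` would wrap the client's generic command call. The loop visits every database index because an empty reply in one database says nothing about the others.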

Transaction Handling (MULTI/EXEC):

  • getNodeByQuery key lookup behavior changed:
    • No key lookups when queuing commands in MULTI, only cross-slot
      validation.
    • Key lookups happen at EXEC time.
    • SELECT inside MULTI/EXEC is now checked, ensuring key validation
      uses the selected DB at lookup.
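
The ordering described above can be illustrated with a toy model. This is not the server's `getNodeByQuery` logic: `ToyTransaction` and `slot_of` are hypothetical names, only `GET` and `SELECT` are modelled, and hash tags are ignored. It shows a cross-slot check while queuing without touching any database, and key lookups at EXEC time that follow `SELECT`.

```python
import binascii
from typing import Dict, List, Optional, Tuple

Command = Tuple[str, ...]


def slot_of(key: str) -> int:
    # Simplified slot function for the sketch: hash tags are ignored.
    return binascii.crc_hqx(key.encode(), 0) % 16384


class ToyTransaction:
    """Toy model: cross-slot check while queuing, key lookups only at EXEC."""

    def __init__(self, dbs: List[Dict[str, str]], selected_db: int = 0):
        self.dbs = dbs
        self.selected_db = selected_db
        self.queue: List[Command] = []
        self.slot: Optional[int] = None

    def queue_command(self, *cmd: str) -> str:
        if cmd[0] == "GET":
            slot = slot_of(cmd[1])
            if self.slot is not None and slot != self.slot:
                return "CROSSSLOT"  # rejected without touching any database
            self.slot = slot
        self.queue.append(cmd)
        return "QUEUED"

    def exec(self) -> List[Optional[str]]:
        replies: List[Optional[str]] = []
        db = self.selected_db
        for cmd in self.queue:
            if cmd[0] == "SELECT":
                db = int(cmd[1])  # later lookups use the newly selected DB
                replies.append("OK")
            elif cmd[0] == "GET":
                replies.append(self.dbs[db].get(cmd[1]))
        self.selected_db = db
        self.queue.clear()
        self.slot = None
        return replies
```

In this model, a `GET foo` queued after `SELECT 1` is resolved against database 1 at EXEC, even though the client had database 0 selected when it was queued.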

Valkey-cli:

  • valkey-cli has been updated to support resharding across all databases.

Configuration:

  • Introduce a new configuration, cluster-databases.
    It controls the maximum number of databases in cluster mode.

Implements #1319

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 91.00000% with 9 lines in your changes missing coverage. Please review.

Project coverage is 70.84%. Comparing base (2d200df) to head (f2ed97c).
Report is 7 commits behind head on unstable.

Files with missing lines    Patch %    Lines
src/valkey-cli.c            80.00%     8 Missing ⚠️
src/cluster.c               96.87%     1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1671      +/-   ##
============================================
- Coverage     71.02%   70.84%   -0.18%     
============================================
  Files           123      123              
  Lines         66116    66173      +57     
============================================
- Hits          46956    46879      -77     
- Misses        19160    19294     +134     
Files with missing lines    Coverage            Δ
src/cluster_legacy.c        86.78% <100.00%>    (+0.37%) ⬆️
src/config.c                78.39% <ø>          (-0.05%) ⬇️
src/db.c                    89.99% <100.00%>    (+0.42%) ⬆️
src/server.c                87.94% <100.00%>    (+0.03%) ⬆️
src/server.h                100.00% <ø>         (ø)
src/valkey-benchmark.c      62.42% <100.00%>    (+0.24%) ⬆️
src/cluster.c               90.24% <96.87%>     (+0.21%) ⬆️
src/valkey-cli.c            54.60% <80.00%>     (-1.32%) ⬇️

... and 13 files with indirect coverage changes

@xbasel xbasel marked this pull request as draft February 6, 2025 10:01
@xbasel xbasel marked this pull request as ready for review February 10, 2025 21:37
@xbasel xbasel requested a review from zuiderkwast February 10, 2025 22:13
src/db.c Outdated
@@ -1728,12 +1714,6 @@ void swapMainDbWithTempDb(serverDb *tempDb) {
void swapdbCommand(client *c) {
    int id1, id2;

    /* Not allowed in cluster mode: we have just DB 0 there. */

Would that be enough for SWAPDB to work in cluster mode? What will happen in a setup with 2 shards, each responsible for half of the slots in the databases?

Member Author

@xbasel xbasel Feb 11, 2025

With this implementation, SWAPDB must be executed on all primary nodes. There are three options:

  1. Allow SWAPDB and shift responsibility to the user – Risky, non-atomic, can cause temporary inconsistency and data corruption. Needs strong warnings.
  2. Keep SWAPDB disabled in cluster mode – Safest, avoids inconsistency.
  3. Make SWAPDB cluster-wide and atomic – Complex, unclear feasibility.

I think option 2 is the safest bet. @JoBeR007 wdyt?

Contributor

Is SWAPDB replicated as a single command? Then it's atomic.

If it's risky, it's risky in standalone mode with replicas too, right?

I think we can allow it. Swapping the data can only be done in some non-realtime workloads anyway I think.

I think the risk from replication and the risk from having to execute SWAPDB on all primary nodes are unrelated: the user can't control the first, but the user is the main source of risk in the second.
I would keep SWAPDB disabled in cluster mode if we decide to continue with this implementation.

Contributor

In cluster mode, consistency is per slot.

Contributor

Yes, FLUSHDB is very similar in this regard. If a failover happens just before this command has been propagated to replicas, it's a big thing, but it's no surprise I think. The client can use WAIT or check replication offset to make sure the FLUSHDB or SWAPDB was successful on the replicas.
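
The WAIT-based check mentioned above could look like the following sketch. `send` is a hypothetical callable that sends one command on a single connection and returns its reply, and `run_and_confirm` is an illustrative name. `WAIT numreplicas timeout` replies with the number of replicas that acknowledged the preceding writes on that connection.

```python
from typing import Callable


def run_and_confirm(send: Callable[..., object], command: tuple,
                    replicas: int, timeout_ms: int = 1000) -> bool:
    """Run `command`, then check that `replicas` replicas acknowledged it."""
    send(*command)
    # WAIT replies with the number of replicas that acknowledged prior writes.
    acked = send("WAIT", replicas, timeout_ms)
    return int(acked) >= replicas
```

A `False` result only means the acknowledgement did not arrive in time; the command has still been executed on the primary.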

Member

Regarding this, I think it is not just an issue of Multi-database but is more related to atomic slot migration. If a shard is in a stable state (not undergoing slot migration), then flushdb/flushall/swapdb are safe. However, if slot migration is in progress, it might lead to data inconsistency.

I think this needs to be considered alongside atomic slot migration:

  1. During atomic slot migration, if we encounter flushall/flushdb for slots being migrated, we can send a command like flushslot or flushslotall to the target shard.
  2. As for swapdb, I recommend temporarily prohibiting execution during atomic slot migration.

@PingXie @enjoy-binbin , please also take note of this.

Member

Makes sense. @murphyjacob4 FYI

Member

I made a comment on the issue about this, but also worth mentioning it's hard to orchestrate SWAPDB. Even in steady state, flushdb and flushall are idempotent (you can send them multiple times) but swapdb isn't. If a command times out on one node, it's hard to reason about if it was successful and how to retry it. I think we should continue to disable SWAPDB in cluster mode for now, unless we introduce an idempotent way to do the swap.
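
The idempotency difference can be shown with a small in-memory model (a sketch only; `flushdb`, `swapdb`, and `apply_with_retry` are illustrative Python functions over plain dicts, not server code). A blind retry after a timeout leaves a flush unchanged but undoes a swap.

```python
from typing import Dict, List

Databases = List[Dict[str, str]]


def flushdb(dbs: Databases, index: int) -> None:
    dbs[index].clear()


def swapdb(dbs: Databases, a: int, b: int) -> None:
    dbs[a], dbs[b] = dbs[b], dbs[a]


def apply_with_retry(op, dbs: Databases, *args: int) -> Databases:
    """Apply `op` twice, as a client would after a timeout hid a success."""
    op(dbs, *args)
    op(dbs, *args)  # blind retry
    return dbs
```

Retrying `flushdb` yields the same empty database, while retrying `swapdb` restores the original layout, so the caller cannot retry safely without knowing whether the first attempt succeeded.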

Member Author

Maybe introducing UUID tracking for SWAPDB requests would work.
Disabling SWAPDB for now.
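
One way to picture the UUID-tracking idea is the sketch below. It is purely hypothetical: SWAPDB takes no request ID, and `DedupingSwapper` and `new_request_id` are invented for illustration. Each node remembers the request IDs it has applied and ignores a repeated delivery, which makes a retry harmless.

```python
import uuid
from typing import Dict, List, Set


class DedupingSwapper:
    """Toy node that ignores a swap whose request ID it has already applied."""

    def __init__(self, dbs: List[Dict[str, str]]):
        self.dbs = dbs
        self.applied: Set[str] = set()

    def swapdb(self, request_id: str, a: int, b: int) -> bool:
        if request_id in self.applied:
            return False  # duplicate delivery: already swapped, do nothing
        self.dbs[a], self.dbs[b] = self.dbs[b], self.dbs[a]
        self.applied.add(request_id)
        return True


def new_request_id() -> str:
    return str(uuid.uuid4())
```

A real design would also need to bound or expire the set of remembered IDs and replicate it, which this sketch does not attempt.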

@soloestoy soloestoy requested review from soloestoy and removed request for zuiderkwast February 12, 2025 06:28
@soloestoy
Member

I'm happy that we did "Unified db rehash method for both standalone and cluster #12848" when developing kvstore, which made the implementation of multi-database simpler.

@ranshid ranshid added the release-notes This issue should get a line item in the release notes label Feb 17, 2025
Collaborator

@hpatro hpatro left a comment

We need to add history to the SWAPDB, SELECT, and MOVE JSON files to indicate they're supported since 9.0.

@ranshid ranshid added the client-changes-needed Client changes may be required for this feature label Feb 24, 2025
@xbasel
Member Author

xbasel commented Mar 3, 2025

documentation: valkey-io/valkey-doc#242

@xbasel xbasel requested a review from a team March 5, 2025 11:38
@hwware
Member

hwware commented Mar 5, 2025

It looks like some test cases related to the multi-database feature are still failing. Please fix them first, thanks.

@xbasel xbasel marked this pull request as draft March 5, 2025 18:36
@xbasel xbasel force-pushed the multidb branch 4 times, most recently from 538e23e to 63151ae Compare March 6, 2025 12:14
Signed-off-by: xbasel <[email protected]>
Member

@madolson madolson left a comment

I'm good with the current iteration.

@xbasel xbasel requested a review from ranshid April 29, 2025 12:01
Signed-off-by: xbasel <[email protected]>
@xbasel xbasel requested a review from ranshid April 29, 2025 18:38
@zuiderkwast zuiderkwast added the to-be-merged Almost ready to merge label May 3, 2025
@madolson madolson merged commit 2fe08f8 into valkey-io:unstable May 4, 2025
51 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Valkey 9.0 May 4, 2025
@madolson
Member

madolson commented May 4, 2025

I created issues for everything I noticed that needs a follow-up. I wanted to merge so that other PRs, like atomic slot migration, can rebase.

@madolson madolson removed the to-be-merged Almost ready to merge label May 4, 2025
@zuiderkwast
Contributor

Almost all tests in Daily failed today. What happened here?

SELECT 1 failed: ERR DB index is out of range

https://github.com/valkey-io/valkey/actions/runs/14826557992

zuiderkwast pushed a commit that referenced this pull request May 6, 2025
Re-adds a statement to restore the `singledb` config that was
accidentally removed in PR #1671.

Fixes #2049

Signed-off-by: xbasel <[email protected]>
madolson added a commit that referenced this pull request May 6, 2025
One of the new tests that was added uses `CONFIG GET PORT`, which isn't
the right one for TLS.

Also removed some other uses of the helper which aren't actually used.

Introduced as part of #1671.

---------

Signed-off-by: Madelyn Olson <[email protected]>
SoftlyRaining pushed a commit to SoftlyRaining/valkey that referenced this pull request May 14, 2025
Signed-off-by: xbasel <[email protected]>
Signed-off-by: zhaozhao.zz <[email protected]>
Co-authored-by: zhaozhao.zz <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
Labels
  • client-changes-needed: Client changes may be required for this feature
  • needs-doc-pr: This change needs to update a documentation page. Remove label once doc PR is open.
  • release-notes: This issue should get a line item in the release notes
Projects
Status: Done