Commit cf5451a

Merge branch 'unstable' into feat-implement-TDIGEST.TRIMMED_MEAN-command

2 parents: 8b2bb67 + 792df36

41 files changed: 1129 additions & 125 deletions


.asf.yaml

Lines changed: 1 addition & 0 deletions

```diff
@@ -67,6 +67,7 @@ github:
     '2.12': {}
     '2.13': {}
     '2.14': {}
+    '2.15': {}

 notifications:
   commits: commits@kvrocks.apache.org
```

.github/workflows/kvrocks.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -58,7 +58,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - name: Check typos
-        uses: crate-ci/typos@v1.42.0
+        uses: crate-ci/typos@v1.43.1
        with:
          config: .github/config/typos.toml
      - uses: apache/skywalking-eyes/header@v0.7.0
```

AGENTS.md

Lines changed: 165 additions & 0 deletions

New file:

# AGENTS.md

This file provides guidance to AI coding agents (e.g., Claude Code, Cursor, ChatGPT Codex, Gemini) when working with code in this repository.

While working on Apache Kvrocks, please remember:

- Always use English in code and comments.
- Only add meaningful comments when the code's behavior is difficult to understand.
- Add or update tests to cover externally observable behavior and regressions when you change or add functionality.
- Always run the formatter before submitting changes.

## Build and Development Commands

### Building

```bash
# Build kvrocks and utilities
./x.py build                           # Build to ./build directory
./x.py build -j N                      # Build with N parallel jobs
./x.py build --unittest                # Build with unit tests
./x.py build -DENABLE_OPENSSL=ON       # Build with TLS support
./x.py build --ninja                   # Use Ninja build system
./x.py build --skip-build              # Only run CMake configure
./x.py build -DCMAKE_BUILD_TYPE=Debug  # Debug build

# Run a local server
./build/kvrocks -c kvrocks.conf

# Fetch dependencies
./x.py fetch-deps                      # Fetch dependency archives
```

### Testing

```bash
# Build and run C++ unit tests
./x.py build --unittest
./x.py test cpp

# Run Go integration tests
./x.py test go

# Run specific Go test by path
./x.py test go tests/gocase/unit/...
```

### Lint

You must run the formatter and linters before submitting code changes. This ensures code quality and consistency across the project. It requires installing `clang-format`, `clang-tidy`, and `golangci-lint` locally. Please refer to CONTRIBUTING.md for setup instructions.

```bash
# Format code (must pass before submitting)
./x.py format

# Check code format (fails if not formatted)
./x.py check format

# Run clang-tidy
./x.py check tidy

# Run golangci-lint for Go tests
./x.py check golangci-lint
```

## Architecture Overview

Apache Kvrocks is a distributed key-value NoSQL database compatible with the Redis protocol, using RocksDB as its storage engine.

### Core Structure

- **`src/server/`**: Main server orchestration, connection handling, and worker threads. The `Server` class manages the event loop, worker threads, and coordinates all components.
- **`src/storage/`**: RocksDB integration layer. Key classes:
  - `Storage`: Manages the RocksDB instance, column families, and write batches
  - `Context`: Passes snapshot and batch between APIs for transactional consistency
- **`src/commands/`**: Redis protocol command implementations. Each command type has a corresponding `Commander` subclass.
- **`src/types/`**: Redis data structure implementations (String, Hash, List, Set, ZSet, Stream, etc.)
- **`src/cluster/`**: Cluster management, slot migration, and replication.
- **`src/search/`**: Full-text search and vector search (HNSW) implementation.
- **`src/config/`**: Server configuration parsing and management.
- **`src/cli/`**: Command-line interface utilities.
- **`src/common/`**: Shared utilities and helper functions.
- **`src/stats/`**: Statistics and metrics collection.

### Key Patterns

- **Column Families**: 8 column families are used - PrimarySubkey, Metadata, SecondarySubkey, PubSub, Propagate, Stream, Search, Index.
- **Command Registration**: Commands are registered via the `REDIS_REGISTER_COMMANDS` macro with flags like `kCmdWrite`, `kCmdReadOnly`, `kCmdBlocking`, etc.
- **Write Batch with Index**: Used for transactional mode to group writes before commit.
- **Worker Thread Model**: Libevent-based async I/O with dedicated worker threads.
- **Namespace Isolation**: Token-based multi-tenancy using the `__namespace` column family.

### Data Encoding

- `METADATA_ENCODING_VERSION=1` (default): Encodes 64-bit size and expire time in milliseconds.
- `METADATA_ENCODING_VERSION=0`: Legacy encoding.

Refer to https://kvrocks.apache.org/community/data-structure-on-rocksdb for more details.

## Coding Style and Naming Conventions

- C++ formatting follows `.clang-format` (Google-based, 2-space indent, 120-column limit, sorted includes).
- Use `.cc`/`.h` file extensions with `snake_case` filenames.
- Types use `PascalCase`; match existing patterns in nearby code.
- Use existing utilities and helper functions when possible; avoid reinventing the wheel.
- Go code should stay `gofmt`-clean and comply with `tests/gocase/.golangci.yml`.

## Testing Guidelines

Provide Go tests for integration-level verification of command behaviors and C++ unit tests for internal logic. Focus on testing new features or bug fixes, and avoid adding tests that don't verify meaningful behavior changes.

- **Go Integration Tests** (`tests/gocase/`): Use `*_test.go` files, organized by feature (unit/, integration/, tls/).
- **C++ Unit Tests** (`tests/cppunit/`): Use `*_test.cc` files with the GoogleTest framework.
- Add or update tests alongside behavior changes.
- Prefer focused unit tests; add integration coverage when commands or replication/storage behaviors change.
- Use `./x.py test ...` entry points for consistent setup.

## Commit Message Format

Use conventional commits with a scope indicating the affected component:

```
feat(rdb): add DUMP support for SortedInt type
fix(replication): prevent WAL exhaustion from slow consumers
fix(string): add empty string value check for INCR to match Redis behavior
perf(hash): use MultiGet to reduce RocksDB calls in HMSET
chore(deps): Bump rocksdb to v10.10.1
chore(ci): bump crate-ci/typos action to v1.43.1
chore(tests): replace to slices.Reverse() in go test
```

Common scopes: `server`, `storage`, `commands`, `cluster`, `search`, `types`, `replication`, `rdb`, `stream`, `hash`, `string`, `list`, `set`, `zset`, `deps`, `ci`, `tests`, `conf`.

## Common Tasks

### Adding a New Command

1. Create or update the command handler in `src/commands/`.
2. Implement the `Commander` subclass with `Parse()` and `Execute()` methods.
3. Register the command using the `REDIS_REGISTER_COMMANDS` macro with appropriate flags.
4. Add the underlying data operation in `src/types/` if needed.
5. Add C++ unit tests in `tests/cppunit/`.
6. Add Go integration tests in `tests/gocase/`.

### Adding a New Data Type

1. Implement the type in `src/types/` following existing patterns.
2. Define the metadata encoding in `src/storage/`.
3. Add command handlers in `src/commands/`.
4. Register commands with the `REDIS_REGISTER_COMMANDS` macro.
5. Add tests for both the type operations and command handlers.

### Debugging

1. Check server logs (configurable log level in the kvrocks config).
2. Use the `DEBUG` command for runtime inspection.
3. Use sanitizers via build flags for memory and thread issues.
4. Refer to `tests/lsan-suppressions` and `tests/tsan-suppressions` for known suppression rules.

## Important Notes

- Kvrocks aims for Redis protocol compatibility; always verify behavior against Redis when implementing or fixing commands.
- All changes must pass `./x.py check format` and `./x.py check tidy`.
- Don't change public command behavior unless requested.
- RocksDB is the core storage dependency; be cautious with storage-layer changes.
- Adding a new column family breaks forward compatibility; avoid this if possible and prefer using existing column families.
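The "Adding a New Command" steps above revolve around the `Commander` pattern: parse arguments once, then execute against storage. As a minimal standalone sketch of that shape — the `Status` and `CommandEcho` types here are hypothetical stand-ins for illustration, not the actual kvrocks headers:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for kvrocks' Status type, for illustration only.
struct Status {
  bool ok;
  std::string msg;
  static Status OK() { return {true, ""}; }
  static Status Error(std::string m) { return {false, std::move(m)}; }
};

// A Commander-style class: Parse() validates arguments up front,
// Execute() produces the reply. A real command would go through the
// storage layer instead of echoing its argument.
class CommandEcho {
 public:
  Status Parse(const std::vector<std::string> &args) {
    if (args.size() != 2) return Status::Error("wrong number of arguments");
    message_ = args[1];
    return Status::OK();
  }
  Status Execute(std::string *output) {
    *output = message_;
    return Status::OK();
  }

 private:
  std::string message_;
};
```

The split between `Parse()` and `Execute()` lets the server reject malformed requests before touching storage, which matches the registration flags (`kCmdWrite` etc.) gating execution separately from parsing.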

NOTICE

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,5 +1,5 @@
 Apache Kvrocks
-Copyright 2022-2025 The Apache Software Foundation
+Copyright 2022-2026 The Apache Software Foundation

 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).
```

cmake/rocksdb.cmake

Lines changed: 2 additions & 2 deletions

```diff
@@ -26,8 +26,8 @@ endif()
 include(cmake/utils.cmake)

 FetchContent_DeclareGitHubWithMirror(rocksdb
-  facebook/rocksdb v10.9.1
-  MD5=06a521bf5749f73d0da29844f9ae6fca
+  facebook/rocksdb v10.10.1
+  MD5=dcef50080a4a6c0c0b4b77fd04c60502
 )

 FetchContent_GetProperties(jemalloc)
```

cmake/zlib.cmake

Lines changed: 2 additions & 2 deletions

```diff
@@ -20,8 +20,8 @@ include_guard()
 include(cmake/utils.cmake)

 FetchContent_DeclareGitHubWithMirror(zlib
-  zlib-ng/zlib-ng 2.3.2
-  MD5=7818ea3f3ad80873674faf500fd12a0d
+  zlib-ng/zlib-ng 2.3.3
+  MD5=a2c8df556b61266f100d331268123115
 )

 FetchContent_MakeAvailableWithArgs(zlib
```

kvrocks.conf

Lines changed: 15 additions & 0 deletions

```diff
@@ -231,6 +231,21 @@ replication-delay-bytes 16384
 # Default: 16 updates
 replication-delay-updates 16

+# Maximum sequence lag allowed before disconnecting a slow replica.
+# If a replica falls behind by more than this many sequences, the master will
+# disconnect it to prevent WAL exhaustion. The replica can then reconnect and
+# attempt partial sync (psync) if the sequence is still available.
+# Set to 0 to disable this check (default).
+# Default: 0 (disabled)
+max-replication-lag 0
+
+# Timeout in milliseconds for socket send operations to replicas.
+# If sending data to a replica blocks for longer than this timeout,
+# the connection will be dropped. This prevents the replication feed thread
+# from blocking indefinitely on slow consumers.
+# Default: 30000 (30 seconds)
+replication-send-timeout-ms 30000
+
 # TCP listen() backlog.
 #
 # In high requests-per-second environments you need an high backlog in order
```
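The semantics documented for `max-replication-lag` can be captured in a small decision function. This is a sketch of the documented behavior under a hypothetical helper name (`ShouldDisconnectReplica`), not the actual implementation:

```cpp
#include <cassert>
#include <cstdint>

// Decide whether a replica at replica_seq should be disconnected, given the
// master's latest WAL sequence number and the configured max-replication-lag.
// A limit of 0 disables the check, matching the default in kvrocks.conf.
bool ShouldDisconnectReplica(uint64_t latest_seq, uint64_t replica_seq, int64_t max_lag) {
  if (max_lag <= 0) return false;               // feature disabled
  if (latest_seq <= replica_seq) return false;  // replica is caught up
  return static_cast<int64_t>(latest_seq - replica_seq) > max_lag;
}
```

Note the check only ever triggers when the replica is strictly behind, so a replica exactly at the latest sequence is never dropped regardless of the limit.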

src/cluster/cluster.cc

Lines changed: 1 addition & 1 deletion

```diff
@@ -55,7 +55,7 @@ Cluster::Cluster(Server *srv, std::vector<std::string> binds, int port)
 // cluster data, so these commands should be executed exclusively, and ReadWriteLock
 // also can guarantee accessing data is safe.
 bool Cluster::SubCommandIsExecExclusive(const std::string &subcommand) {
-  std::array subcommands = {"setnodes", "setnodeid", "setslot", "import", "reset"};
+  std::array subcommands = {"setnodes", "setnodeid", "setslot", "import", "reset", "flushslots"};

   return std::any_of(std::begin(subcommands), std::end(subcommands),
                      [&subcommand](const std::string &val) { return util::EqualICase(val, subcommand); });
```
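The check pairs `std::any_of` with kvrocks' case-insensitive comparator so `CLUSTERX FLUSHSLOTS` matches regardless of casing. A self-contained sketch of the same pattern — with a local `EqualICase` stand-in, since the real helper lives in kvrocks' `util` namespace:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>

// Local stand-in for util::EqualICase: byte-wise case-insensitive comparison.
bool EqualICase(const std::string &a, const std::string &b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(), [](unsigned char l, unsigned char r) {
           return std::tolower(l) == std::tolower(r);
         });
}

// Mirrors the updated list in the diff, including the new "flushslots" entry.
bool SubCommandIsExecExclusive(const std::string &subcommand) {
  std::array<std::string, 6> subcommands = {"setnodes", "setnodeid", "setslot",
                                            "import",   "reset",     "flushslots"};
  return std::any_of(subcommands.begin(), subcommands.end(),
                     [&subcommand](const std::string &val) { return EqualICase(val, subcommand); });
}
```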

src/cluster/replication.cc

Lines changed: 32 additions & 6 deletions

```diff
@@ -63,7 +63,9 @@ FeedSlaveThread::FeedSlaveThread(Server *srv, redis::Connection *conn, rocksdb::
       next_repl_seq_(next_repl_seq),
       req_(srv),
       max_delay_bytes_(srv->GetConfig()->max_replication_delay_bytes),
-      max_delay_updates_(srv->GetConfig()->max_replication_delay_updates) {}
+      max_delay_updates_(srv->GetConfig()->max_replication_delay_updates),
+      max_replication_lag_(srv->GetConfig()->max_replication_lag),
+      send_timeout_ms_(srv->GetConfig()->replication_send_timeout_ms) {}

 Status FeedSlaveThread::Start() {
   auto s = util::CreateThread("feed-replica", [this] {
@@ -184,6 +186,21 @@ void FeedSlaveThread::loop() {
   while (!IsStopped()) {
     auto curr_seq = next_repl_seq_.load();

+    // Check replication lag - disconnect slow consumers before WAL is exhausted
+    // Skip check if max_replication_lag_ is 0 (feature disabled)
+    if (max_replication_lag_ > 0) {
+      auto latest_seq = srv_->storage->LatestSeqNumber();
+      if (latest_seq > curr_seq) {
+        auto lag = static_cast<int64_t>(latest_seq - curr_seq);
+        if (lag > max_replication_lag_) {
+          ERROR("Replication lag {} exceeds max allowed {} for slave {}:{}, disconnecting to prevent WAL exhaustion",
+                lag, max_replication_lag_, conn_->GetAnnounceIP(), conn_->GetListeningPort());
+          Stop();
+          return;
+        }
+      }
+    }
+
     if (!iter_ || !iter_->Valid()) {
       if (iter_) INFO("WAL was rotated, would reopen again");
       if (!srv_->storage->WALHasNewData(curr_seq) || !srv_->storage->GetWALIter(curr_seq, &iter_).IsOK()) {
@@ -221,10 +238,12 @@ void FeedSlaveThread::loop() {
       batches_bulk += redis::BulkString("_getack");
     }

-    // Send entire bulk which contain multiple batches
-    auto s = util::SockSend(conn_->GetFD(), batches_bulk, conn_->GetBufferEvent());
+    // Send entire bulk which contain multiple batches with timeout
+    // This prevents blocking indefinitely on slow consumers
+    auto s = util::SockSendWithTimeout(conn_->GetFD(), batches_bulk, conn_->GetBufferEvent(), send_timeout_ms_);
     if (!s.IsOK()) {
-      ERROR("Write error while sending batch to slave: {}. batches: 0x{}", s.Msg(), util::StringToHex(batches_bulk));
+      ERROR("Write error while sending batch to slave {}:{}: {}. batch_size={}", conn_->GetAnnounceIP(),
+            conn_->GetListeningPort(), s.Msg(), batches_bulk.size());
       Stop();
       return;
     }
@@ -260,9 +279,14 @@ void ReplicationThread::CallbacksStateMachine::ConnEventCB(bufferevent *bev, int
   }
   if (events & (BEV_EVENT_ERROR | BEV_EVENT_EOF)) {
     ERROR("[replication] connection error/eof, reconnect the master");
-    // Wait a bit and reconnect
+    // Wait with exponential backoff before reconnecting
+    constexpr int kMaxBackoffSeconds = 60;
+    constexpr int kMaxShiftBits = 6;  // Cap shift to avoid UB; 2^6 = 64 then clamped to 60
     repl_->repl_state_.store(kReplConnecting, std::memory_order_relaxed);
-    std::this_thread::sleep_for(std::chrono::seconds(1));
+    int attempts = repl_->reconnect_attempts_.fetch_add(1, std::memory_order_relaxed);
+    int backoff_secs = std::min(1 << std::min(attempts, kMaxShiftBits), kMaxBackoffSeconds);
+    WARN("[replication] waiting {} seconds before reconnecting (attempt {})", backoff_secs, attempts + 1);
+    std::this_thread::sleep_for(std::chrono::seconds(backoff_secs));
     Stop();
     Start();
   }
@@ -634,6 +658,7 @@ ReplicationThread::CBState ReplicationThread::tryPSyncReadCB(bufferevent *bev) {
   } else {
     // PSYNC is OK, use IncrementBatchLoop
     INFO("[replication] PSync is ok, start increment batch loop");
+    reconnect_attempts_.store(0, std::memory_order_relaxed);  // Reset backoff counter on successful connection
     return CBState::NEXT;
   }
 }
@@ -879,6 +904,7 @@ ReplicationThread::CBState ReplicationThread::fullSyncReadCB(bufferevent *bev) {
     return CBState::RESTART;
   }
   INFO("[replication] Succeeded restoring the backup, fullsync was finish");
+  reconnect_attempts_.store(0, std::memory_order_relaxed);  // Reset backoff counter on successful fullsync
   post_fullsync_cb_();

   // It needs to reload namespaces from DB after the full sync is done,
```
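The reconnect backoff in this file caps both the shift amount and the resulting delay. Isolated into a pure function (the constants are copied from the diff; the function name is ours), the arithmetic is:

```cpp
#include <algorithm>
#include <cassert>

// Capped exponential backoff: 1, 2, 4, ..., clamped to 60 seconds.
// The shift itself is capped at 6 bits so that `1 << attempts` never
// becomes undefined behavior for large attempt counts.
int BackoffSeconds(int attempts) {
  constexpr int kMaxBackoffSeconds = 60;
  constexpr int kMaxShiftBits = 6;  // 2^6 = 64, then clamped to 60
  return std::min(1 << std::min(attempts, kMaxShiftBits), kMaxBackoffSeconds);
}
```

So successive failures wait 1, 2, 4, 8, 16, 32, then 60 seconds from the seventh attempt onward; the counter resets to zero on a successful PSYNC or full sync, restoring the 1-second initial delay.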

src/cluster/replication.h

Lines changed: 3 additions & 0 deletions

```diff
@@ -91,6 +91,8 @@ class FeedSlaveThread {
   // Configurable delay limits
   size_t max_delay_bytes_;
   size_t max_delay_updates_;
+  int64_t max_replication_lag_;
+  int send_timeout_ms_;

   void loop();
   void checkLivenessIfNeed();
@@ -166,6 +168,7 @@ class ReplicationThread : private EventCallbackBase<ReplicationThread> {
   const bool replication_group_sync_ = false;
   std::atomic<int64_t> last_io_time_secs_ = 0;
   int64_t last_ack_time_secs_ = 0;
+  std::atomic<int> reconnect_attempts_ = 0;  // For exponential backoff on reconnection
   bool next_try_old_psync_ = false;
   bool next_try_without_announce_ip_address_ = false;
```
