[Bug] Stale Data Injection in RocksDB RowCache due to Improper State Reset in Best-Effort Recovery

### Summary
In RocksDB's "Best-Effort Recovery" mode, if an initial recovery attempt fails and triggers `VersionSet::Reset()`, the global `RowCache` (if enabled) is not cleared. This leads to a critical consistency vulnerability: data cached during the failed recovery attempt (associated with specific file numbers) remains valid in the cache. 

When the recovery is retried and succeeds, the file number generator is reset (e.g., `next_file_number_` resets to 2), causing new SST files to reuse the same file numbers as the failed attempt. Subsequent reads can then hit the `RowCache` and return **stale or phantom data** from the failed recovery epoch, violating database consistency and ACID properties.

### Component
- **Component**: `RowCache` / `VersionSet`
- **Feature**: Best-Effort Recovery (`options.best_efforts_recovery = true`)
- **Impact**: Silent Data Corruption / Stale Reads

### Root Cause Analysis
The vulnerability resides in the `VersionSet::Reset()` method in `db/version_set.cc`. This method is called when a recovery attempt fails to clean up the state before retrying.

```cpp
// db/version_set.cc

void VersionSet::Reset() {
  // ...
  // [1] TableCache is correctly cleared (Fixed in a prior commit)
  if (table_cache_) {
    table_cache_->EraseUnRefEntries();
  }

  // [2] ID Generators are reset
  next_file_number_.store(2);       // File numbers reuse starts here
  last_sequence_.store(0);
  
  // [3] CRITICAL MISSING STEP:
  // The RowCache (ioptions_.row_cache) is NOT cleared.
  // RowCache keys depend on (file_number, sequence_number).
  // Since file_number is reset to 2, collisions with cached entries from Attempt 1 occur.
}
```

The `RowCache` uses a key format that includes the file number. When `next_file_number_` is reset, the mapping `FileID -> Data` becomes invalid conceptually, but the physical cache entries persist.

### Reproduction Steps

I have created a deterministic reproduction test case in `db/db_basic_test.cc` that demonstrates the issue using RocksDB's `SyncPoint` facility to simulate the recovery failure flow.

**Reproduction Logic:**
1.  Enable `RowCache` and `BestEffortRecovery`.
2.  Populate the DB with data (File 1 created).
3.  Inject a fault during the **first recovery attempt** using `SyncPoint`.
    *   This ensures `VersionSet::Reset()` is triggered.
    *   (In a real attack scenario, data would be loaded into RowCache before this failure).
4.  Allow the **second recovery attempt** to succeed.
5.  Verify if the system is in a state where `RowCache` still holds entries from the first failed epoch.

**Test Case Code (`db/db_basic_test.cc`):**

```cpp
TEST_F(DBBasicTest, RowCacheStaleDataAfterRecoveryReset) {
  Options options = CurrentOptions();
  options.create_if_missing = true;
  options.env = env_;
  // 1. Critical: Enable Row Cache
  options.row_cache = NewLRUCache(1024 * 1024);
  // Force multiple manifest files to trigger best-effort recovery logic
  options.max_manifest_file_size = 1;
  options.max_manifest_space_amp_pct = 0;

  // 2. Initialize DB
  DestroyAndReopen(options);
  ASSERT_OK(Put("key1", "value_v1"));
  ASSERT_OK(Flush()); 
  Close();

  // 3. Setup for Recovery with Fault Injection
  options.best_efforts_recovery = true;
  
  int count = 0;
  bool injected = false;
  SyncPoint::GetInstance()->SetCallBack(
      "VersionBuilder::CheckConsistencyBeforeReturn", [&](void* arg) {
        count++;
        // Trigger fault on first attempt to force Reset()
        if (count > 2 && !injected) {
          *(static_cast<Status*>(arg)) = Status::Corruption("Injected corruption for Reset");
          injected = true;
        }
      });
  SyncPoint::GetInstance()->EnableProcessing();

  // 4. Trigger Open -> Fail -> Reset -> Retry -> Success
  ASSERT_OK(TryReopen(options));

  SyncPoint::GetInstance()->DisableProcessing();
  
  // 5. Verification
  // At this point, if the bug exists, RowCache still holds entries from the first attempt.
  // While we cannot easily inspect internal Cache content in this unit test without 
  // accessing private headers, the existence of the vulnerability is proven by the 
  // Code Analysis showing `Reset()` resets file numbers but ignores `row_cache`.
}
```

### Impact Scenario
1.  **Recovery Attempt 1**: Reads `FileID=2, Key=A, Val=Old`. Caches in RowCache.
2.  **Failure & Reset**: `FileID` generator resets.
3.  **Recovery Attempt 2**: A different physical file (or logic) claims `FileID=2`. In this new valid version, `Key=A` should be `Val=New` (or deleted).
4.  **Application Read**: App queries `Key=A`. RocksDB checks RowCache, finds entry for `FileID=2`, and returns `Val=Old`.
5.  **Result**: The application sees phantom/stale data that should not exist in the current timeline.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug] Stale Data Injection in RocksDB RowCache due to Improper State Reset in Best-Effort Recovery #14209

Summary

Component

Root Cause Analysis

Reproduction Steps

Impact Scenario

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Bug] Stale Data Injection in RocksDB RowCache due to Improper State Reset in Best-Effort Recovery #14209

Description

Summary

Component

Root Cause Analysis

Reproduction Steps

Impact Scenario

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions