Skip to content

[Bug] Heap-Use-After-Free in DBImpl::MakeRoomForWrite during DB Destruction #1293

@yxscc

Description

@yxscc

Summary

There is a race condition between DB::Put/Write operations and DB destruction when the database is under high write pressure.

When the LevelDB write buffer (L0 files) fills up, writer threads are blocked inside DBImpl::MakeRoomForWrite on a condition variable (background_work_finished_signal_). If the DB object is destroyed (delete db) while threads are waiting, the DBImpl destructor does not ensure that waiting writers are safely unblocked or terminated.

Consequently, when the writer threads eventually wake up (e.g., due to SignalAll in destructor or spurious wakeups), they access member variables of the now-destroyed DBImpl object (e.g., versions_, bg_error_, mutex_), leading to a Heap-Use-After-Free (UAF).

Reproduction Steps

Here is a reproduction script that triggers the UAF reliably using AddressSanitizer (ASAN).

reproduce_write_race.cc:

#include <thread>
#include <vector>
#include <string>
#include <iostream>
#include <cassert>
#include <chrono>
#include <atomic>
#include "leveldb/db.h"
#include "leveldb/env.h"

// Race Condition PoC:
// Trigger "StopWritesTrigger" (L0 file limit) to block writers.
// Destroy DB while writers are blocked.
// Expect: Heap-Use-After-Free.

std::atomic<bool> stop_writers{false};

void writer_thread(leveldb::DB* db, int id) {
    int i = 0;
    while (!stop_writers) {
        std::string key = "key-" + std::to_string(id) + "-" + std::to_string(i);
        std::string value = std::string(1024, 'v'); // 1KB value
        
        // This call will block inside MakeRoomForWrite -> Wait()
        // when L0 files limit (12) is reached.
        leveldb::Status s = db->Put(leveldb::WriteOptions(), key, value);
        
        if (!s.ok()) break;
        i++;
    }
}

int main() {
    // Clean up previous run
    system("rm -rf /tmp/testdb_race_write");
    
    leveldb::DB* db;
    leveldb::Options options;
    options.create_if_missing = true;
    // Small buffer to force frequent flushes and trigger L0 limit quickly
    options.write_buffer_size = 128 * 1024; 
    
    leveldb::Status status = leveldb::DB::Open(options, "/tmp/testdb_race_write", &db);
    assert(status.ok());

    std::cout << "Starting stress test (Writers)..." << std::endl;
    std::vector<std::thread> threads;
    for (int i = 0; i < 8; i++) {
        threads.emplace_back(writer_thread, db, i);
    }

    // Run long enough to hit the L0 limit (12 files)
    std::this_thread::sleep_for(std::chrono::seconds(3));

    std::cout << "Simulating Shutdown (delete db) while writers are blocked..." << std::endl;
    
    // TRIGGER: Destroy the DB object
    delete db; 
    
    std::cout << "DB Deleted. Waiting for background threads to crash..." << std::endl;
    
    // Keep process alive to catch the ASAN report
    std::this_thread::sleep_for(std::chrono::seconds(2));
    stop_writers = true;

    for (auto& t : threads) {
        if (t.joinable()) t.join();
    }
    
    return 0;
}

Compilation & Execution:

g++ -fsanitize=address reproduce_write_race.cc -o reproduce_write_race -lleveldb -lpthread
./reproduce_write_race

Actual Behavior

ASAN Report:

=================================================================
==1588138==ERROR: AddressSanitizer: heap-use-after-free on address 0x60400005ab90
READ of size ... at 0x60400005ab90 thread T5
    #0 ... in __interceptor_memcpy
    #1 ... in leveldb::Status::CopyState(char const*)
    #2 ... in leveldb::DBImpl::MakeRoomForWrite(bool)
    #3 ... in leveldb::DBImpl::Write
    ...
freed by thread T0 here:
    #0 ... in operator delete[]
    #1 ... in leveldb::DBImpl::~DBImpl()

Root Cause Analysis

In db/db_impl.cc, the MakeRoomForWrite function loops while waiting for compaction:

while (true) {
  if (!bg_error_.ok()) { ... }
  else if (versions_->NumLevelFiles(0) >= config::kL0_StopWritesTrigger) {
      Log(options_.info_log, "Too many L0 files; waiting...\n");
      // Writer blocks here
      background_work_finished_signal_.Wait(); 
      // <--- When it wakes up after 'delete db', 'this' (DBImpl) is invalid.
  }
  // ...
}

The DBImpl destructor signals waiting threads:

DBImpl::~DBImpl() {
  // ...
  shutting_down_.store(true, ...);
  background_work_finished_signal_.SignalAll(); // Wakes up writers
  // ...
} // Mutex and other members are destroyed here

However, it does not wait for writers_ to exit, nor does MakeRoomForWrite check for shutting_down_ immediately after waking up and before accessing member variables (like versions_ in the loop condition).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions