Summary
There is a race condition between DB::Put/Write operations and DB destruction when the database is under high write pressure.
When the LevelDB write buffer (L0 files) fills up, writer threads are blocked inside DBImpl::MakeRoomForWrite on a condition variable (background_work_finished_signal_). If the DB object is destroyed (delete db) while threads are waiting, the DBImpl destructor does not ensure that waiting writers are safely unblocked or terminated.
Consequently, when the writer threads eventually wake up (e.g., due to SignalAll in destructor or spurious wakeups), they access member variables of the now-destroyed DBImpl object (e.g., versions_, bg_error_, mutex_), leading to a Heap-Use-After-Free (UAF).
Reproduction Steps
Here is a reproduction script that triggers the UAF reliably using AddressSanitizer (ASAN).
reproduce_write_race.cc:
#include <thread>
#include <vector>
#include <string>
#include <iostream>
#include <cassert>
#include <chrono>
#include <atomic>
#include "leveldb/db.h"
#include "leveldb/env.h"
// Race Condition PoC:
// Trigger "StopWritesTrigger" (L0 file limit) to block writers.
// Destroy DB while writers are blocked.
// Expect: Heap-Use-After-Free.
std::atomic<bool> stop_writers{false};
void writer_thread(leveldb::DB* db, int id) {
int i = 0;
while (!stop_writers) {
std::string key = "key-" + std::to_string(id) + "-" + std::to_string(i);
std::string value = std::string(1024, 'v'); // 1KB value
// This call will block inside MakeRoomForWrite -> Wait()
// when L0 files limit (12) is reached.
leveldb::Status s = db->Put(leveldb::WriteOptions(), key, value);
if (!s.ok()) break;
i++;
}
}
int main() {
// Clean up previous run
system("rm -rf /tmp/testdb_race_write");
leveldb::DB* db;
leveldb::Options options;
options.create_if_missing = true;
// Small buffer to force frequent flushes and trigger L0 limit quickly
options.write_buffer_size = 128 * 1024;
leveldb::Status status = leveldb::DB::Open(options, "/tmp/testdb_race_write", &db);
assert(status.ok());
std::cout << "Starting stress test (Writers)..." << std::endl;
std::vector<std::thread> threads;
for (int i = 0; i < 8; i++) {
threads.emplace_back(writer_thread, db, i);
}
// Run long enough to hit the L0 limit (12 files)
std::this_thread::sleep_for(std::chrono::seconds(3));
std::cout << "Simulating Shutdown (delete db) while writers are blocked..." << std::endl;
// TRIGGER: Destroy the DB object
delete db;
std::cout << "DB Deleted. Waiting for background threads to crash..." << std::endl;
// Keep process alive to catch the ASAN report
std::this_thread::sleep_for(std::chrono::seconds(2));
stop_writers = true;
for (auto& t : threads) {
if (t.joinable()) t.join();
}
return 0;
}
Compilation & Execution:
g++ -fsanitize=address reproduce_write_race.cc -o reproduce_write_race -lleveldb -lpthread
./reproduce_write_race
Actual Behavior
ASAN Report:
=================================================================
==1588138==ERROR: AddressSanitizer: heap-use-after-free on address 0x60400005ab90
READ of size ... at 0x60400005ab90 thread T5
#0 ... in __interceptor_memcpy
#1 ... in leveldb::Status::CopyState(char const*)
#2 ... in leveldb::DBImpl::MakeRoomForWrite(bool)
#3 ... in leveldb::DBImpl::Write
...
freed by thread T0 here:
#0 ... in operator delete[]
#1 ... in leveldb::DBImpl::~DBImpl()
Root Cause Analysis
In db/db_impl.cc, the MakeRoomForWrite function loops while waiting for compaction:
while (true) {
if (!bg_error_.ok()) { ... }
else if (versions_->NumLevelFiles(0) >= config::kL0_StopWritesTrigger) {
Log(options_.info_log, "Too many L0 files; waiting...\n");
// Writer blocks here
background_work_finished_signal_.Wait();
// <--- When it wakes up after 'delete db', 'this' (DBImpl) is invalid.
}
// ...
}
The DBImpl destructor signals waiting threads:
DBImpl::~DBImpl() {
// ...
shutting_down_.store(true, ...);
background_work_finished_signal_.SignalAll(); // Wakes up writers
// ...
} // Mutex and other members are destroyed here
However, it does not wait for writers_ to exit, nor does MakeRoomForWrite check for shutting_down_ immediately after waking up and before accessing member variables (like versions_ in the loop condition).
Summary
There is a race condition between
DB::Put/Writeoperations andDBdestruction when the database is under high write pressure.When the LevelDB write buffer (L0 files) fills up, writer threads are blocked inside
DBImpl::MakeRoomForWriteon a condition variable (background_work_finished_signal_). If theDBobject is destroyed (delete db) while threads are waiting, theDBImpldestructor does not ensure that waiting writers are safely unblocked or terminated.Consequently, when the writer threads eventually wake up (e.g., due to
SignalAllin destructor or spurious wakeups), they access member variables of the now-destroyedDBImplobject (e.g.,versions_,bg_error_,mutex_), leading to a Heap-Use-After-Free (UAF).Reproduction Steps
Here is a reproduction script that triggers the UAF reliably using AddressSanitizer (ASAN).
reproduce_write_race.cc:Compilation & Execution:
Actual Behavior
ASAN Report:
Root Cause Analysis
In
db/db_impl.cc, theMakeRoomForWritefunction loops while waiting for compaction:The
DBImpldestructor signals waiting threads:However, it does not wait for
writers_to exit, nor doesMakeRoomForWritecheck forshutting_down_immediately after waking up and before accessing member variables (likeversions_in the loop condition).