Missing db_mutex_ lock restoration in FlushJob::MemPurge error path leading to race condition #14198

@yxscc

Summary

I identified a concurrency bug in FlushJob::MemPurge where db_mutex_ is not re-acquired before returning from a specific error path. This leaves the mutex in an unlocked state when returning to the caller FlushJob::Run, which expects the lock to be held.

This violation can trigger an AssertHeld failure in builds with assertions enabled. In release builds, subsequent operations such as WriteLevel0Table proceed without the lock, which can cause race conditions or undefined behavior.

Affected Component

  • File: db/flush_job.cc
  • Function: FlushJob::MemPurge
  • Component: Mempurge (experimental feature)

Details

In FlushJob::MemPurge, the code explicitly unlocks the mutex to perform memory operations:

// db/flush_job.cc:393
db_mutex_->Unlock();

However, if the configured CompactionFilter does not ignore snapshots (IgnoreSnapshots() returns false), the function hits an error path that returns immediately without re-locking:

// db/flush_job.cc:478-481
if (compaction_filter != nullptr &&
    !compaction_filter->IgnoreSnapshots()) {
  s = Status::NotSupported(
      "CompactionFilter::IgnoreSnapshots() = false is not supported "
      "anymore.");
  return s; // <--- BUG: Returns with mutex UNLOCKED
}
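
One way to make every exit path restore the lock is a scope guard that unlocks on construction and re-locks on destruction. The following is a minimal, self-contained sketch of that idea and not RocksDB code: TrackedMutex, ScopedUnlock, and MemPurgeModel are hypothetical names, and the real fix may simply re-acquire db_mutex_ before the early return instead.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for the DB mutex: a std::mutex that also records
// whether it is held, so the lock contract can be checked. The flag is only
// meaningful when read by the thread that owns the lock.
class TrackedMutex {
 public:
  void Lock() {
    mu_.lock();
    held_ = true;
  }
  void Unlock() {
    held_ = false;
    mu_.unlock();
  }
  bool IsHeld() const { return held_; }

 private:
  std::mutex mu_;
  bool held_ = false;
};

// Hypothetical guard: releases the mutex on construction and re-acquires it
// on destruction, so early returns cannot leave the mutex unlocked.
class ScopedUnlock {
 public:
  explicit ScopedUnlock(TrackedMutex* mu) : mu_(mu) { mu_->Unlock(); }
  ~ScopedUnlock() { mu_->Lock(); }
  ScopedUnlock(const ScopedUnlock&) = delete;
  ScopedUnlock& operator=(const ScopedUnlock&) = delete;

 private:
  TrackedMutex* mu_;
};

// Simplified model of the error path described above. Returns an empty
// string on success and "NotSupported" on the early-return path.
std::string MemPurgeModel(TrackedMutex* db_mutex,
                          bool filter_ignores_snapshots) {
  assert(db_mutex->IsHeld());     // caller must enter with the lock held
  ScopedUnlock unlock(db_mutex);  // lock released for the unlocked section
  if (!filter_ignores_snapshots) {
    return "NotSupported";  // guard destructor re-locks before the caller resumes
  }
  return "";  // guard destructor re-locks on the normal path too
}
```

With this shape, a caller that locks the mutex and calls MemPurgeModel(&mu, false) gets "NotSupported" back while still holding the lock, which is the state the fallback path expects.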

The caller FlushJob::Run handles this error by falling back to WriteLevel0Table, assuming the lock contract (release-and-reacquire) was honored:

// db/flush_job.cc:291
} else {
  // This will release and re-acquire the mutex.
  s = WriteLevel0Table(); // <--- Called without lock!
}

Inside WriteLevel0Table, the code asserts that the lock is held, which is false in this scenario:

// db/flush_job.cc:849
db_mutex_->AssertHeld();
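
The sequence above can be modeled in a few lines to show which lock state the fallback observes. This is a toy sketch under stated assumptions: LockState, BuggyMemPurge, FixedMemPurge, and LockHeldAfterPurge are hypothetical names, and the model only tracks a held flag rather than performing real locking.

```cpp
#include <cassert>

// Hypothetical model of the mutex: tracks only whether the lock is held.
struct LockState {
  bool held = false;
  void Lock() { held = true; }
  void Unlock() { held = false; }
};

enum class PurgeResult { kOk, kNotSupported };

// Mirrors the reported error path: unlocks, then returns early without
// re-locking when the filter does not ignore snapshots.
PurgeResult BuggyMemPurge(LockState& mu, bool filter_ignores_snapshots) {
  mu.Unlock();
  if (!filter_ignores_snapshots) {
    return PurgeResult::kNotSupported;  // lock is still released here
  }
  mu.Lock();
  return PurgeResult::kOk;
}

// Same path, with the lock restored before the early return.
PurgeResult FixedMemPurge(LockState& mu, bool filter_ignores_snapshots) {
  mu.Unlock();
  if (!filter_ignores_snapshots) {
    mu.Lock();  // honor the release-and-reacquire contract
    return PurgeResult::kNotSupported;
  }
  mu.Lock();
  return PurgeResult::kOk;
}

// Models the caller: enters with the lock held, runs the purge, and reports
// the lock state that the next step (the fallback's held-check) would see.
bool LockHeldAfterPurge(PurgeResult (*purge)(LockState&, bool),
                        bool filter_ignores_snapshots) {
  LockState mu;
  mu.Lock();
  purge(mu, filter_ignores_snapshots);
  return mu.held;
}
```

In this model, LockHeldAfterPurge(BuggyMemPurge, false) is false, which corresponds to the fallback being entered without the lock, while the same call with FixedMemPurge is true.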

Reproduction

This issue can be triggered under the following conditions:

  1. experimental_mempurge_threshold > 0 (Mempurge enabled).
  2. A CompactionFilter is installed that returns IgnoreSnapshots() == false.
  3. A flush is triggered with reason kWriteBufferFull.

We verified this by instrumenting FlushJob::Run and observing that the execution flow enters WriteLevel0Table immediately after MemPurge returns Status::NotSupported, while the mutex state remains unlocked.
