Assertion failed: (index.IsBound()) crash during corrupted WAL replay - process aborts without recoverable error

**Description**
When opening a DuckDB database with a corrupted or incomplete WAL (Write-Ahead Log) file, DuckDB crashes due to an assertion failure, causing the entire process to abort. This failure occurs before any error can be propagated to the caller, making the situation unrecoverable at the application level.

This behavior is particularly problematic for embedded or desktop applications, where unexpected shutdowns (e.g., force-quit, power loss) are realistic scenarios.

---

**Environment**

* **DuckDB Version:** Tested on 1.1.x, confirmed present in 1.4.3 (latest as of Dec 2025)
* **Bindings / Integration:** `duckdb-rs` v1.4.3 in a Rust/Tauri desktop application
* **OS:** macOS 14.x (Apple Silicon), also reproducible on other platforms

---

**Error Message**

```
Assertion failed: (index.IsBound()), function operator(), 
file row_group_collection.cpp, line 671.
```

---

**Steps to Reproduce**

1. Create a DuckDB database and perform write operations.
2. Force-quit the application mid-transaction (simulating a crash or power failure).
3. This leaves a WAL file in an inconsistent or partially written state.
4. Attempt to reopen the database.
5. DuckDB crashes during WAL replay with the assertion failure above.
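For a more deterministic reproduction than force-quitting, a partially written WAL can be approximated by truncating a copy of a real `.wal` file. The sketch below uses only the Rust standard library; the helper name `truncate_to_fraction` is hypothetical. Treating truncation as equivalent to the crash state is an assumption: DuckDB may detect a plainly truncated WAL on its own, so this may not trigger the same assertion.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Truncates the file at `path` to `fraction` (0.0..=1.0) of its current
/// length to mimic a torn write. Returns the new length in bytes.
/// Hypothetical test helper: run it on a COPY of the database and WAL.
fn truncate_to_fraction(path: &Path, fraction: f64) -> io::Result<u64> {
    // Rejects out-of-range values (and NaN, which is never inside the range).
    if !(0.0..=1.0).contains(&fraction) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "fraction must be in 0.0..=1.0",
        ));
    }
    let file = OpenOptions::new().write(true).open(path)?;
    let old_len = file.metadata()?.len();
    let new_len = (old_len as f64 * fraction) as u64;
    file.set_len(new_len)?;
    // Flush the size change to disk before the database is reopened.
    file.sync_all()?;
    Ok(new_len)
}
```

Trying several fractions against fresh copies of the same WAL makes it easier to find a cut point that lands in the middle of an entry.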

---

**Expected Behavior**
DuckDB should handle this scenario gracefully by one of the following means:

* Returning a recoverable error (e.g., `Result::Err`) that allows the caller to handle the failure.
* Gracefully skipping or invalidating corrupted WAL entries.
* At minimum, avoiding a hard process abort and allowing the application to continue running.

---

**Actual Behavior**

* The process is terminated via a C++ `abort()` triggered by an assertion.
* No error is returned to the caller.
* Application-level recovery code never runs.

This is especially problematic because:

* Rust’s `catch_unwind` only catches unwinding Rust panics; it cannot intercept a C++ `abort()`, which terminates the process with `SIGABRT`.
* The entire application crashes, not just the database subsystem.
* Users lose all application state and context.
* There is no opportunity for graceful recovery or user-facing error handling.
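Because the abort cannot be caught in-process, one way an application could protect itself is to attempt the first open in a short-lived child process and inspect how that child terminated. The sketch below uses only the standard library. The names `ProbeOutcome` and `probe_in_child` are hypothetical, and it assumes the application ships some helper command (for example, its own binary re-executed with a hypothetical probe flag) that opens the database and exits.

```rust
use std::io;
use std::process::{Command, Stdio};

/// How the probing child process ended.
#[derive(Debug, PartialEq)]
enum ProbeOutcome {
    /// The child exited with status 0.
    Opened,
    /// The child exited with a nonzero status code.
    Failed(i32),
    /// The child was terminated by a signal (on Unix, e.g. SIGABRT),
    /// so no exit code is available.
    Killed,
}

/// Runs `program` with `args` and classifies its exit status.
/// An aborting child cannot take the parent process down with it.
fn probe_in_child(program: &str, args: &[&str]) -> io::Result<ProbeOutcome> {
    let status = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .status()?;
    Ok(match status.code() {
        Some(0) => ProbeOutcome::Opened,
        Some(code) => ProbeOutcome::Failed(code),
        // On Unix, `code()` returns None when a signal ended the process.
        None => ProbeOutcome::Killed,
    })
}
```

If the probe reports `Killed` or `Failed`, the parent can show an error or fall back to a recovery path instead of crashing. On Windows an abort typically surfaces as a nonzero exit code rather than a signal, so both variants should be treated as a failed open.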

---

**Workaround**
We implemented a defensive workaround that deletes WAL files before attempting to open the database:

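```rust
use duckdb::Connection;
use std::path::Path;

/// Opens the database after removing any leftover WAL file.
/// WARNING: this discards every change that exists only in the WAL.
fn open_without_wal(db_path: &str) -> Result<Connection, Box<dyn std::error::Error>> {
    // DuckDB stores the WAL next to the database file as `<db_path>.wal`.
    let wal_path = format!("{db_path}.wal");
    if Path::new(&wal_path).exists() {
        std::fs::remove_file(&wal_path)?;
    }
    // With no WAL present, DuckDB has nothing to replay on open.
    Ok(Connection::open(db_path)?)
}
```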

This avoids the crash, but it loses every change that exists only in the WAL, including committed transactions that had not yet been checkpointed into the main database file.
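A less destructive variant of this workaround is to move the WAL aside rather than delete it, so the file is still available for manual inspection or for attaching to a bug report. This is a minimal sketch using only the standard library; the function name `quarantine_wal` and the `.corrupt-<unix-seconds>` suffix are hypothetical choices, not DuckDB conventions.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Renames `<db_path>.wal` to `<db_path>.wal.corrupt-<unix-seconds>`.
/// Returns the new path, or `None` when no WAL file exists.
fn quarantine_wal(db_path: &Path) -> io::Result<Option<PathBuf>> {
    // Build `<db_path>.wal` without assuming the path is valid UTF-8.
    let mut wal = db_path.as_os_str().to_owned();
    wal.push(".wal");
    let wal = PathBuf::from(wal);
    if !wal.exists() {
        return Ok(None);
    }
    // Timestamp suffix keeps earlier quarantined files from being overwritten
    // (as long as two quarantines do not happen within the same second).
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    let mut target = wal.clone().into_os_string();
    target.push(format!(".corrupt-{secs}"));
    let target = PathBuf::from(target);
    fs::rename(&wal, &target)?;
    Ok(Some(target))
}
```

The database still opens without the changes held in the WAL, so the data-loss caveat applies unchanged. The only gain is that the evidence is preserved.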

---

**Suggested Fix**

* Replace assertions in WAL replay and recovery code paths with proper error handling (return errors instead of asserting).
* Consider adding a configuration option such as:

  * `ignore_corrupted_wal`
  * `wal_recovery_mode = { strict | best_effort | skip }`
* At minimum:

  * Log a warning
  * Skip corrupted WAL entries
  * Avoid calling `abort()` during WAL replay
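To make the proposed `wal_recovery_mode` values concrete, here is a toy model in Rust of how the three modes might differ. Nothing here is DuckDB API: `WalEntry`, `ReplayError`, and `replay` are hypothetical, and the semantics assigned to each mode (in particular, that `best_effort` keeps the valid prefix and stops at the first corrupt entry) are one possible interpretation of the suggestion above.

```rust
/// Hypothetical recovery modes mirroring the proposed option values.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WalRecoveryMode {
    Strict,
    BestEffort,
    Skip,
}

/// Error returned instead of aborting the process.
#[derive(Debug, PartialEq)]
enum ReplayError {
    CorruptEntry { index: usize },
}

/// Stand-in for a WAL entry: a payload plus whether its checksum verified.
struct WalEntry {
    payload: u32,
    checksum_ok: bool,
}

/// Replays `entries` and returns the payloads that were applied.
fn replay(entries: &[WalEntry], mode: WalRecoveryMode) -> Result<Vec<u32>, ReplayError> {
    let mut applied = Vec::new();
    if mode == WalRecoveryMode::Skip {
        // Skip: ignore the WAL entirely.
        return Ok(applied);
    }
    for (index, entry) in entries.iter().enumerate() {
        if !entry.checksum_ok {
            return match mode {
                // Strict: surface a recoverable error to the caller.
                WalRecoveryMode::Strict => Err(ReplayError::CorruptEntry { index }),
                // BestEffort: keep the valid prefix and stop here.
                _ => Ok(applied),
            };
        }
        applied.push(entry.payload);
    }
    Ok(applied)
}
```

The key property in all three modes is that corruption produces a return value the caller can act on, never a process abort.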

---

**Relevant Code Path (Approximate)**

```
DuckDBConnection::open()
  → DatabaseInstance::Initialize()
    → StorageManager::LoadDatabase()
      → WAL::Replay()
        → RowGroupCollection::operator()   // assertion failure
          → abort()
```

---

**Impact**

* **Severity:** Critical – causes complete, unrecoverable application crashes
* **Data Loss:** Potential loss of entire application state (not limited to DB contents)
* **User Experience:** Extremely poor; application appears to crash randomly on startup
* **Affected Users:** Anyone using DuckDB in embedded, desktop, or offline-first applications where unexpected shutdowns can occur

---

**Related Areas**
This issue may be related to:

* WAL replay and recovery mechanisms
* Row group state management during recovery
* Checkpointing and commit boundary handling

