Attach fixes sim by glommer · Pull Request #5509 · tursodatabase/turso

glommer · 2026-02-21T13:39:34Z

Description

Add attach to the simulator. Then fix the absolute flurry of bugs that it found =)

Motivation and context

Productizing ATTACH

Description of AI Usage

I reviewed the simulator code closely with Claude, then when happy let it run overnight with a prompt instructing it to run simulations, stop when finding a bug, ignore bugs not related to attach, and then fix, commit, and move on.

turso-bot

Please review @jussisaurio

Add full infrastructure for testing ATTACH databases in the simulator: - SimulatorEnv tracks attached databases with path helpers and cleanup - Selective per-file fault injection via inject_fault_selective() on SimIO - CREATE TABLE redirect to attached DBs (30% probability) in plan generation - Proper QualifiedName/DoublyQualified handling for dotted table names in sql_generation (FROM clauses, column references, result columns) - Re-ATTACH on database reopen with DatabaseOpts::with_attach(true) - Fix core/connection.rs attach_database() to reuse parent DB IO instead of creating a new PlatformIO (enables simulator fault injection) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Multiple interrelated fixes that make ATTACH database writes work correctly across connections: - Pager initialization: attach_database() now uses Database::_init() instead of init_pager(), ensuring encryption and auto_vacuum config are applied to attached pagers. - Schema visibility: Schema changes on attached DBs are now kept in a connection-local copy until WAL commit succeeds, then published to the shared Database via publish_attached_schema(). Previously, changes were written directly to the shared schema, making uncommitted changes visible to other connections. - Write lock handling: end_attached_write_txns() now iterates ALL attached pagers (via holds_write_lock()) instead of only the current program's write_databases set. This fixes explicit transactions where the COMMIT statement's program differs from the statement that acquired the write lock. - IO yield during commit: Added CommittingAttached commit state to properly handle IO yields from commit_dirty_pages on attached pagers. Fixed commit_txn() early return that skipped re-entry when tx_state was already None (set by main pager commit before attached commit completes). - Read transactions: SELECT on attached databases now emits Transaction opcodes to acquire WAL read locks, preventing stale reads. Added begin_read_on_database() to ProgramBuilder. - Connection drop: WAL locks on attached pagers are now released when the connection is dropped. Reproducer: cargo run --bin limbo_sim -- --seed 2 -n 200 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ALTER TABLE RENAME and other operations on attached databases were updating the wrong sqlite_schema table because the UPDATE statement wasn't qualified with the database name. Also, the sqlite_schema btree lookup used the main schema instead of the attached database's schema via with_schema(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When SchemaUpdated triggers a reprepare, attached database pagers could retain stale read locks and connection-local schema changes from the previous prepare. This caused an infinite SchemaUpdated loop because the pager still saw old page 1 with a stale schema cookie. Now reprepare() releases locks and discards local schema for attached databases in write_databases before re-translating. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When ALTER TABLE RENAME was executed on an attached database table (e.g., aux1.foo RENAME TO bar), the shadow model renamed the table to just bar instead of aux1.bar, losing the database prefix. Subsequent queries generated by the simulator would use the bare name, which the database resolver could not find. Reproducer: cargo run --bin limbo_sim -- --seed 500 -n 200 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

resolve_upsert_target() and get_index() in emit_upsert() were using resolver.schema() (main DB schema) instead of the attached DB schema, causing ON CONFLICT clause does not match any PRIMARY KEY or UNIQUE constraint errors when INSERT ... ON CONFLICT targets a table in an attached database. Fix: pass database_id through bind_insert() and use resolver.with_schema(database_id, ...) for both resolve_upsert_target and get_index calls. Reproducer: simulator seeds 22, 36, 58, 71, 73, 75 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

reprepare() only iterated write_databases to clean up attached pager locks, but SELECT queries on attached databases only populate read_databases. This caused an infinite SchemaUpdated loop when a DDL change followed by ROLLBACK left stale schema state on the attached pager used by a subsequent SELECT. Fix: chain both write_databases and read_databases iterators when cleaning up attached pager state during reprepare. Reproducer: simulator seeds 64, 88 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two related issues with attached database write transactions: 1. Statement savepoints were only created for the main pager (db 0). When a write to an attached DB triggered ABORT (e.g. UNIQUE constraint violation), the rollback had no savepoint to restore on the attached pager, leaving dirty pages in place. Fix: open savepoints on attached pagers in the Transaction opcode and handle release/rollback in end_statement(). 2. end_attached_write_txns() silently swallowed commit_dirty_pages errors (logging a warning and falling through to success), causing data loss when a fault occurred during attached pager commit. Fix: propagate the error with return Err(e). Reproducer: simulator seeds 245 (savepoints), 251 (error propagation) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When commit_txn yields on IO during the attached DB commit phase (CommittingAttached), re-entry into op_auto_commit needs to recognize this state and continue the commit. Previously only CommitState::Committing was checked, causing attached DB commits that yielded on IO to fall through and attempt a new commit instead of resuming. Reproducer: limbo_sim --seed 16 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When an explicit ROLLBACK or error-triggered rollback occurs after DDL operations (e.g., DROP TABLE) on attached databases, the connection-local schema cache was not being cleared. This left stale schema entries that showed tables as dropped even after the rollback restored them on disk. The fix adds database_schemas().remove() calls alongside rollback_attached() in both the ROLLBACK opcode path (op_auto_commit) and the error rollback path (rollback_current_txn), matching the existing pattern in end_attached_write_txns. Reproducer: limbo_sim --seed 313 (also fixes seed 202) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The IntegrityCk instruction (used by PRAGMA integrity_check) was always using the main database pager, even when checking an attached database (e.g., PRAGMA aux0.integrity_check). This caused it to read pages from the main DB file instead of the attached DB file, leading to ShortRead errors or incorrect integrity results. Add a `db` field to IntegrityCk that carries the database index from translation, and resolve the correct pager via get_pager_from_database_index() at execution time. Found via simulator with ATTACH support enabled. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…OpenWrite When UPDATE changes the primary key (rowid) on an attached database table with a UNIQUE index, the PrebuiltEphemeralTable code path emitted OpenWrite instructions with database_id=0 (main) instead of the attached database's id. This caused "short read on page" errors (single connection) or "IdxDelete: no matching index entry found" (multi-connection). The bug had two locations: - main_loop.rs: PrebuiltEphemeralTable case used the ephemeral table's database_id instead of target_table.database_id for the table cursor - emitter.rs: target_database_id was derived from plan.table_references which contains the ephemeral table (database_id=0) after the optimizer replaces it, instead of using target_table.database_id Found via simulator seed 867. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…th_flags open_file_with_flags() would attempt to open the file and acquire an exclusive lock before checking the DATABASE_MANAGER registry for an already-open Database. When multiple connections ATTACH the same file, the second ATTACH would fail with "Failed locking file" because the first connection already holds the lock. Fix by checking the registry early: if the database is already open in-process, return the existing Arc<Database> without opening a new file handle. Found via simulator seed 37 (n=1000). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When a statement writes to the main DB and reads from an attached DB, step_end_write_txn only released write locks on attached pagers via end_attached_write_txns, but never released read locks on attached pagers that only had read transactions. This caused subsequent statements to reuse a stale WAL snapshot on the attached DB, missing data committed by other connections. Add end_attached_read_txns calls to both Done paths in step_end_write_txn (initial commit and CommittingAttached re-entry). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

SQLite reserves database index 0 for "main" and 1 for "temp"; attached databases start at index 2. The codebase had raw numeric comparisons like `database_id >= 2` and `db: 0` scattered across translate/ and vdbe/, which were unclear without knowing this convention. Add MAIN_DB_ID, TEMP_DB_ID, FIRST_ATTACHED_DB_ID constants and an is_attached_db() helper in core/lib.rs. Replace all magic numbers across 17 files. Remove the duplicate local constants from Resolver in emitter.rs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ator The differential simulator's SQLite schema cache workaround (pragma_user_version query) only invalidated the main database's schema. When ALTER TABLE DROP COLUMN was executed on an attached database, other rusqlite connections still saw the old column count, causing false divergences between limbo and rusqlite. Fix by also querying {db_name}.sqlite_master for each attached database, which forces SQLite to re-read the attached database's schema. Discovered via seed: 12173377563160143215 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

glommer requested review from jussisaurio, penberg and pereman2 as code owners February 21, 2026 13:39

turso-bot bot reviewed Feb 21, 2026

View reviewed changes

github-actions bot added simulator core translation/planning vdbe Storage labels Feb 21, 2026

glommer and others added 15 commits February 21, 2026 08:28

glommer force-pushed the attach-fixes-sim branch from 4c7cbf7 to e2487bc Compare February 21, 2026 14:29

glommer and others added 2 commits February 21, 2026 10:40

Merge branch 'main' into attach-fixes-sim

db77252

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Attach fixes sim#5509

Attach fixes sim#5509
glommer wants to merge 17 commits intotursodatabase:mainfrom
glommer:attach-fixes-sim

glommer commented Feb 21, 2026

Uh oh!

turso-bot bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Conversation

glommer commented Feb 21, 2026

Description

Motivation and context

Description of AI Usage

Uh oh!

turso-bot bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant