Fix sqlite3 "database is locked" in pipelined shards cache #926
Conversation
Enable WAL journal mode and a 30s busy timeout on repodata_shards.db so the cache reader thread no longer races with the network writer thread. Falls back gracefully on filesystems where WAL is unsupported.
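The reader/writer race this PR addresses can be reproduced in miniature with two connections on one database file. A hedged sketch (the `shards` table name and schema here are illustrative, not taken from the PR):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "repodata_shards.db")

writer = sqlite3.connect(path, timeout=0.1)  # short busy timeout to fail fast
writer.execute("CREATE TABLE shards (name TEXT PRIMARY KEY, data BLOB)")
writer.commit()
reader = sqlite3.connect(path, timeout=0.1)

# Default rollback journal: an exclusive write transaction blocks readers,
# and once the busy timeout expires SQLite raises OperationalError.
writer.execute("BEGIN EXCLUSIVE")
err = ""
try:
    reader.execute("SELECT count(*) FROM shards").fetchone()
except sqlite3.OperationalError as exc:
    err = str(exc)
writer.commit()
print(err)  # database is locked

# In WAL mode the same read proceeds against a snapshot while a write is open.
writer.execute("PRAGMA journal_mode = WAL")
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO shards VALUES ('a', x'00')")
count = reader.execute("SELECT count(*) FROM shards").fetchone()[0]
writer.commit()
print(count)  # 0 -- the uncommitted insert is invisible to the reader
```

The second half shows why WAL resolves the race: the reader sees the last committed snapshot instead of waiting on the writer's lock.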
dholth
left a comment
I'm surprised this came up, since (I thought) we had short transactions in the shard cache system.
I've considered a different solution. It involves giving the network thread a handle to the cache thread's incoming queue; or connecting them through the main thread, which would receive bytes and forward them to the cache thread. Then the cache thread would work through its queue, either looking up or storing requests as they came in.
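The queue idea in the comment above could be sketched roughly as follows: only a single cache thread ever touches sqlite3, so reader/writer lock contention cannot occur. All names here (`cache_thread`, the `"store"`/`"lookup"` operations, the `shards` table) are illustrative assumptions, not the PR's actual API:

```python
import queue
import sqlite3
import threading

requests = queue.Queue()  # the cache thread's incoming queue

def cache_thread(db_path):
    # The only sqlite3 connection in the process; created and used
    # entirely within this thread.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS shards (name TEXT PRIMARY KEY, data BLOB)")
    while True:
        op, name, payload, reply = requests.get()
        if op == "stop":
            break
        if op == "store":  # bytes forwarded from the network thread
            with conn:
                conn.execute("INSERT OR REPLACE INTO shards VALUES (?, ?)", (name, payload))
        elif op == "lookup":
            row = conn.execute("SELECT data FROM shards WHERE name = ?", (name,)).fetchone()
            reply.put(row[0] if row else None)
    conn.close()

t = threading.Thread(target=cache_thread, args=(":memory:",))
t.start()
requests.put(("store", "shard-a", b"\x00\x01", None))
reply = queue.Queue()
requests.put(("lookup", "shard-a", None, reply))
result = reply.get()
requests.put(("stop", None, None, None))
t.join()
print(result)  # b'\x00\x01'
```

Because the queue serializes stores and lookups in arrival order, no busy timeout or journal-mode tuning is needed at all.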
```python
conn.row_factory = sqlite3.Row
with conn as c:
    try:
        mode = c.execute("PRAGMA journal_mode = WAL").fetchone()[0]
```
I'm a big fan of WAL mode. It could fail not because of filesystem locks, which are used by all sqlite3 modes, but because shared memory is not available (if conda's cache is on a shared filesystem, accessed by two computers). conda-index users requested we drop WAL mode for this reason.
On the other hand, this is only a cache, and there are lots of reasons why conda might not work if two condas try to use the same cache concurrently.
I notice that we turn foreign_keys = ON but we only have one table and no foreign keys. Probably good practice anyway.
Yeah, that's fine, I think. I remembered that PR. I think in this case we need it.
Co-authored-by: Daniel Holth <dholth@anaconda.com>
#927 is an outline of what a queue for cache insertion could look like
Description
Fixes #924.
The pipelined shard traversal runs a `cache_fetch_thread` (reads) and a `network_fetch_thread` (writes) against separate `sqlite3.Connection`s to the same `repodata_shards.db`. With the default rollback journal, a reader cannot proceed while a writer holds the exclusive lock. Under CI load the 5 s default busy timeout expires and SQLite raises `sqlite3.OperationalError: database is locked`.

This PR applies two changes to `shards_cache.connect()`:

- Enable WAL mode (`PRAGMA journal_mode = WAL`), which allows readers to proceed against a snapshot while a writer is appending. The pragma is wrapped in a `try/except sqlite3.DatabaseError` so it degrades gracefully on locking-hostile filesystems (see "Test sharded repodata on locks-hostile filesystem" #891) or corrupt databases.
- `PRAGMA synchronous = NORMAL` (sufficient for a cache, avoids unnecessary `fsync`).

The dev script `dev/scripts/requests-fetch-all-shards.py` already used these same pragmas; this brings the production `connect()` in line.

Checklist - did you ...

- [ ] Add a file to the `news` directory (using the template) for the next release's release notes?
- [ ] Add / update outdated documentation?
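For reference, the connection setup this PR describes could look roughly like the following. This is a sketch only; the real `shards_cache.connect()` signature and error handling in the PR may differ:

```python
import sqlite3

def connect(db_path):
    """Open the shard cache with WAL mode and a long busy timeout (sketch)."""
    conn = sqlite3.connect(db_path, timeout=30)  # 30 s busy timeout
    conn.row_factory = sqlite3.Row
    try:
        # WAL lets the cache reader proceed while the network thread writes.
        # The pragma returns the resulting mode; on WAL-hostile filesystems
        # it stays e.g. "delete" and the cache simply keeps the default
        # rollback journal.
        mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    except sqlite3.DatabaseError:
        pass  # locking-hostile filesystem or corrupt database: stay on defaults
    conn.execute("PRAGMA synchronous = NORMAL")  # enough durability for a cache
    return conn
```

Checking the pragma's return value rather than assuming success matters because, as noted in the review, WAL can be refused when shared memory is unavailable (e.g. a cache on a network filesystem).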