Migrate cache backend from diskcache to pysciqlop-cache #289

Open
jeandet wants to merge 1 commit into SciQLop:main from jeandet:migrate-to-sciqlop-cache

jeandet (Member) commented May 4, 2026

Summary

  • Swap diskcache → pysciqlop-cache in cache.py and speasy_index.py. APIs are near-identical; the few differences (kwarg renames, stats() returning a dict, no close()/RAII, FanoutCache.transact() taking a shard key) are handled in the wrapper.
  • Auto-migrate legacy diskcache directories on first launch via pysciqlop_cache.migrate. The new cache is staged in a sibling directory and only swapped in on success; the legacy data is preserved at <path>.diskcache.backup for the user to verify and delete.
  • Fix three pre-existing issues in _request_locker.py while we're here, leveraging the new add(expire=, tag=) primitive:
    • Stale PendingRequest entries from crashed workers now auto-expire instead of leaking until the next attempt picks them up.
    • Peers wake up promptly when the producer drops the lock (poll key in cache) instead of always sleeping the full timeout. Regression test added (RequestLockerWakeup).
    • Pending markers carry a pending_request tag so a proxy/server can drop them all on startup via evict_pending_requests().
  • Fix latent FanoutCache.transact() no-op: Cache.transact() now accepts an optional shard key and forwards it; _providers_caches.py callers pass product as the key.
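
The three request-locker fixes can be illustrated with a minimal sketch. It does not use pysciqlop-cache itself: `TinyCache` is a hypothetical in-memory stand-in whose method names follow the description above, and `claim`, `wait_for_release`, and the `poll` interval are assumptions made for illustration, not the code in _request_locker.py.

```python
import threading
import time


class TinyCache:
    """Hypothetical in-memory stand-in for a cache with add(key, value, expire=, tag=)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, deadline or None, tag)

    def _purge(self, key):
        # Drop the entry if its expiry deadline has passed.
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None and entry[1] <= time.monotonic():
            del self._data[key]

    def add(self, key, value, expire=None, tag=None):
        """Store only if the key is absent; return True when the value was stored."""
        with self._lock:
            self._purge(key)
            if key in self._data:
                return False
            deadline = None if expire is None else time.monotonic() + expire
            self._data[key] = (value, deadline, tag)
            return True

    def __contains__(self, key):
        with self._lock:
            self._purge(key)
            return key in self._data

    def delete(self, key):
        with self._lock:
            return self._data.pop(key, None) is not None

    def evict_tag(self, tag):
        """Remove every entry carrying the tag; return how many were removed."""
        with self._lock:
            stale = [k for k, entry in self._data.items() if entry[2] == tag]
            for k in stale:
                del self._data[k]
            return len(stale)


PENDING_TAG = "pending_request"


def claim(cache, key, timeout):
    # Single atomic step: the marker expires on its own if the worker crashes.
    return cache.add(key, "pending", expire=timeout, tag=PENDING_TAG)


def wait_for_release(cache, key, timeout, poll=0.01):
    """Return the seconds waited; stops as soon as the marker disappears."""
    start = time.monotonic()
    while key in cache and time.monotonic() - start < timeout:
        time.sleep(poll)
    return time.monotonic() - start
```

In this sketch a peer that loses the `claim` race calls `wait_for_release` and returns shortly after the producer deletes the marker, and a server could call `cache.evict_tag(PENDING_TAG)` at startup to clear leftovers.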

Why pysciqlop-cache

  • Crash-safe atomic add(key, value, expire=, tag=) — no separate lock + check + set dance.
  • Per-shard FanoutCache.transact(key) — the SciQLop proxy plans to use FanoutCache for ~4 TB-scale storage where SQLite write contention spreading and per-shard recovery are valuable.
  • Round-2 concurrency audit: atomic incr, evict_tag, reentrant transactions, fixed counter races. See pysciqlop-cache v0.1.0 release notes.
  • Hybrid blob/file storage at 8 KB threshold: large CDF chunks land as files, SQLite stays small.
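
To show why a transaction on a sharded cache needs a key, here is a small sketch of per-shard locking. `ShardedStore`, `shard_of`, and the CRC32-based routing are hypothetical and chosen only for illustration; they are not claimed to match how pysciqlop-cache's FanoutCache routes keys.

```python
import threading
import zlib
from contextlib import contextmanager


class ShardedStore:
    """Hypothetical sharded store: one lock and one dict per shard."""

    def __init__(self, shards=4):
        self._locks = [threading.RLock() for _ in range(shards)]
        self._data = [dict() for _ in range(shards)]

    def shard_of(self, key):
        # Deterministic routing: the same key always maps to the same shard.
        return zlib.crc32(key.encode("utf-8")) % len(self._locks)

    @contextmanager
    def transact(self, key):
        # Only the shard owning `key` is locked; other shards stay writable.
        index = self.shard_of(key)
        with self._locks[index]:
            yield self._data[index]


store = ShardedStore(shards=4)
with store.transact("some/product") as shard:
    shard["some/product"] = "fragment"
```

Without a key there is no way to pick which shard to lock, which is consistent with the keyless call having been a no-op before this change. Passing the product as the key, as the callers in _providers_caches.py now do, serializes writers per product rather than globally.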

Migration path for users

On first launch with the new code, Speasy detects the legacy layout (cache.db in the cache directory) and migrates automatically. Mechanics:

  1. Migrate the original diskcache → sibling staging directory.
  2. On success: rename original → <path>.diskcache.backup, then staging → original.
  3. On any failure: original is left untouched, partial staging is cleaned up.
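
The three steps above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in this PR: `convert` is a hypothetical callable standing in for the real migrator, and the `.staging` suffix is invented here; only the `cache.db` check and the `.diskcache.backup` suffix come from the description.

```python
import shutil
import tempfile
from pathlib import Path


def needs_migration(cache_dir: Path) -> bool:
    # Legacy layout is recognized by a cache.db file in the cache directory.
    return (cache_dir / "cache.db").is_file()


def stage_and_swap(cache_dir: Path, convert) -> bool:
    """Migrate into a sibling directory, then swap it in; return True if migrated.

    convert(src, dst) is a hypothetical callable that writes the new cache to dst.
    """
    if not needs_migration(cache_dir):
        return False
    staging = cache_dir.with_name(cache_dir.name + ".staging")  # assumed name
    backup = cache_dir.with_name(cache_dir.name + ".diskcache.backup")
    try:
        convert(cache_dir, staging)  # step 1: original is only read
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)  # step 3: clean partial staging
        raise
    cache_dir.rename(backup)  # step 2a: keep legacy data for the user
    try:
        staging.rename(cache_dir)  # step 2b: new cache takes the original path
    except OSError:
        backup.rename(cache_dir)  # roll back so the legacy cache stays in place
        raise
    return True
```

Because staging and backup are siblings of the cache directory, both renames stay on the same filesystem, so the swap is a pair of cheap renames rather than a copy.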

The migration tool requires diskcache to be importable. We keep it in requirements_dev.txt for development; users upgrading from a previous Speasy install already have it installed in their environment, so the auto-migration just works. Fresh installs that never had a Speasy cache need nothing.

For very large caches the migration takes minutes; it's logged at WARNING so users know what's happening.

Test plan

  • tests/test_cache.py — 232 passed (9:32 wall clock)
  • Manually verified migration end-to-end on a real 21 GB diskcache (32 636 entries in 63 s).
  • Smoke test: fresh Cache create/set/get/transact/stats/lock/contains/drop.
  • CI to confirm on supported Python versions (3.9–3.14).
  • Optional: deploy to a proxy environment to exercise the FanoutCache path before tagging a release.

closes #275

Swap the cache and index backends to pysciqlop-cache (>=0.1.0). It has
identical semantics for everything Speasy uses, plus crash-safe atomic
add(expire=, tag=), per-shard FanoutCache transactions, and a built-in
diskcache-compatible migrator.

Migration is automatic on first launch: if a legacy diskcache layout is
detected, the cache is staged in a sibling directory and only swapped in
once the new sciqlop-cache is fully written; the old data is preserved
at <path>.diskcache.backup so users can verify before deleting.

While here, fix three issues in the request deduplication code that the
new backend makes easy:

- Stale PendingRequest entries on worker crash now auto-expire via
  add(expire=timeout) instead of leaking until the next attempt.
- Peers waiting on an in-flight request wake up promptly when the
  producer finishes (poll `key in cache`) instead of always sleeping the
  full timeout. Reproducer test added.
- Pending markers are tagged so a proxy/server can drop them all on
  startup with evict_pending_requests().

The latent FanoutCache.transact() no-op is also fixed: Cache.transact()
now takes an optional shard key and forwards it correctly. Callers in
_providers_caches.py pass the product as shard key.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

sonarqubecloud (Bot) commented May 4, 2026

Quality Gate failed

Failed conditions
C Security Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud


Development

Successfully merging this pull request may close these issues.

Switch to SciQLop cache
