Locking and requirements for applications #98
-
Thanks for such a neat project! I recently added Litestream to an existing Drone (single-instance Go app) setup in the home lab and wanted to ask about some things I learned. For background, Drone (the "app") runs in a container with a local SQLite db, originally backed by a Kubernetes persistent disk, but now able to just use a Litestream init container and sidecar to restore from S3 and replicate to S3.

Since Litestream was added, the app can perform all its duties (serve the auth'd UI, run some pipelines), but many actions fail at random unless manually retried. The app quirks (e.g. a user is logged out, a job didn't start) match app debug logs showing operations that fail because "database is locked". Disabling `litestream replicate` alleviates these, as does rolling back to the persistent disk.

So my question is about locking behaviors and the expectations on applications (discussed a bit in #58). Ideally, apps should be retrying operations, but I suspect many single-process apps don't expect other locks. I found that Litestream takes out a write lock at one point (with a note in the code) and suspected it could be a factor, but I don't know the background. Is it fundamental that Litestream write-lock? Basically, the use case I'm curious about is apps that maybe don't do SQLite retries well: can Litestream be as stealthy as possible, grab the lock infrequently, not use a write lock, etc.? Or I'd trade off "realtime" for "just sync every few minutes is fine", if it meant the application was less aware its SQLite was being touched.

PS: I've explored some other avenues, like recompiling Drone with a recent mattn/go-sqlite3 to match Litestream's in case there are missing fixes, but no difference. I also manually checked that SQLite is in WAL mode and that the locking mode is normal, not exclusive. For now I've rolled back to using a persistent disk; the quirks from lock contention were too much.
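For completeness, the "manually checked" step above just amounts to reading two PRAGMAs. A minimal sketch against a scratch database, using Python's stdlib sqlite3 since the PRAGMAs are driver-agnostic (the scratch path stands in for the app's actual db file):

```python
import os
import sqlite3
import tempfile

# Scratch database standing in for the app's SQLite file.
path = os.path.join(tempfile.mkdtemp(), "check.db")
conn = sqlite3.connect(path)

# Litestream requires WAL mode; setting it returns the active mode.
print(conn.execute("PRAGMA journal_mode=WAL").fetchone()[0])  # wal

# Locking mode should be "normal" (the default), not "exclusive",
# since exclusive mode would prevent any other process from reading.
print(conn.execute("PRAGMA locking_mode").fetchone()[0])  # normal
```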
-
Hi @dghubble, thanks for the feedback. The locking mechanism turns out to be a historical artifact of how Litestream worked originally and I should be able to remove it. I wrote up an issue for it (#99) but the tl;dr is that Litestream relied on SQLite to validate transaction boundaries in the WAL originally but that validation has been moved into Litestream so the lock isn't necessary anymore.
I added the issue to the v0.3.3 release that I'm hoping to get out by the end of the week. That should make for a much more pleasant experience with applications that don't use a busy timeout.
For `mattn/go-sqlite3`, you can specify a `_busy_timeout=1000` in the connection string and that should fix it…
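To illustrate what the busy timeout changes, here's a minimal sketch using Python's stdlib sqlite3 (the underlying mechanism is the same as `_busy_timeout` in a mattn/go-sqlite3 DSN): without a timeout, a second writer fails immediately with "database is locked"; with one, SQLite itself retries for up to the timeout before surfacing the error.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Connection 1 opens a write transaction, standing in for any
# other writer (such as Litestream's lock) holding the database.
w1 = sqlite3.connect(path, isolation_level=None)
w1.execute("PRAGMA journal_mode=WAL")
w1.execute("CREATE TABLE t (x)")
w1.execute("BEGIN IMMEDIATE")  # acquires the write lock

# Connection 2 with a zero busy timeout fails immediately.
w2 = sqlite3.connect(path, timeout=0, isolation_level=None)
try:
    w2.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError as e:
    print(e)  # database is locked

# With a nonzero busy timeout, SQLite retries internally for up to
# that long, so brief lock contention never reaches the application.
w1.execute("COMMIT")
w2.execute("BEGIN IMMEDIATE")  # succeeds once the lock is released
w2.execute("COMMIT")
```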