feat: Add atuin store repair #3429
Greptile Summary
Adds
Confidence Score: 5/5
Safe to merge; all findings are P2 style concerns. The security boundary (UPDATE scoped to user_id), idempotency guarantees, and index migration are all correct. The only flagged issue is a future-proofing concern about non-history tags that doesn't affect any current code path.
crates/atuin-client/src/history/store.rs: the non-history-tag bail in
Important Files Changed
Reviews (1): Last reviewed commit: "feat: Add atuin store repair"
This pull request has been mentioned on Atuin Community. There might be relevant details there: https://forum.atuin.sh/t/key-management-user-experience/1442/4
I got bitten by the same issue described by multiple users in this thread, not because sync is broken in any way, but because of a failed attempt to script the login to my atuin server.
I tried @ellie's proposed solution described here, but I have atuin running on a lot of servers, all of which auto-sync, so I deemed it too much work to disable auto sync, sync up and down on all of them, and then turn auto sync back on. Every time I thought I had synced everything correctly, the undecryptable records turned up again, and atuin store push --force multiplied the issue due to the complete rewrite of the server database.

So instead I wrote atuin store repair.

As I am not a Rust expert, I had a bit of help from Claude.
I'm currently running this branch on my own server, and I have successfully repaired the history on all of my servers.
How it works
atuin store repair surgically replaces the encrypted payload of each undecryptable record with a decryptable no-op: a HistoryRecord::Delete pointing at a freshly minted UUID that does not match any real history entry.

atuin store repair is idempotent at every level:
Re-running on the same host, same state
Step 1 scans the local store for records that fail to decrypt with the current key. After a successful repair, every record decrypts, so the bad set is empty and the command exits with "Nothing to repair." No server traffic, no local writes.
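To make the scan step concrete, here is a minimal sketch of how the bad set could be computed. It is not the code from this branch: `StoredRecord`, `try_decrypt`, `find_undecryptable`, `scan_report`, and the one-byte "key" are all hypothetical stand-ins for atuin's real record and encryption types, and the non-empty report string is invented for illustration.

```rust
/// Hypothetical stand-in for an encrypted record in the local store.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredRecord {
    pub id: u64,
    /// Stand-in for "which key encrypted this payload".
    pub key_id: u8,
    pub payload: Vec<u8>,
}

/// Toy decryption: succeeds only when the record was written with `key`.
pub fn try_decrypt(record: &StoredRecord, key: u8) -> Result<Vec<u8>, String> {
    if record.key_id == key {
        Ok(record.payload.clone())
    } else {
        Err(format!("record {} does not decrypt with the current key", record.id))
    }
}

/// Step 1: collect the ids of records that fail to decrypt with the current key.
pub fn find_undecryptable(store: &[StoredRecord], key: u8) -> Vec<u64> {
    store
        .iter()
        .filter(|r| try_decrypt(r, key).is_err())
        .map(|r| r.id)
        .collect()
}

/// Message for a given bad set; an empty set means there is nothing to do.
pub fn scan_report(bad: &[u64]) -> String {
    if bad.is_empty() {
        "Nothing to repair.".to_string()
    } else {
        // Hypothetical wording, not the command's actual output.
        format!("{} record(s) need repair", bad.len())
    }
}
```

Because the scan only reads, a second run after a successful repair sees an empty bad set and can return before touching the server or the local store.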
Re-running after another host already fixed the server
When host A repairs first, the server holds the good replacement. When host B runs repair, fetch_server_view pulls that record, resolve_replacement takes the ServerHadClean branch, and B overwrites its local copy from the server. B pushes nothing — to_push stays empty for records already fixed remotely.
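The branch logic described above can be sketched roughly as follows. Only the names `resolve_replacement`, `ServerHadClean`, and `to_push` come from the description; the signature, the `MintedLocally` variant, `plan_repair`, and the `HashMap` used as the server view are assumptions made for this illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Replacement {
    pub id: u64,
    pub payload: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum Resolution {
    /// Another host already repaired this record; adopt the server copy.
    ServerHadClean(Replacement),
    /// Hypothetical variant: nobody repaired it yet, so this host mints one.
    MintedLocally(Replacement),
}

/// `clean_on_server` maps record id to a payload that decrypts cleanly
/// (a stand-in for whatever fetch_server_view returns).
pub fn resolve_replacement(
    id: u64,
    clean_on_server: &HashMap<u64, Vec<u8>>,
    fresh_payload: Vec<u8>,
) -> Resolution {
    match clean_on_server.get(&id) {
        Some(payload) => Resolution::ServerHadClean(Replacement {
            id,
            payload: payload.clone(),
        }),
        None => Resolution::MintedLocally(Replacement {
            id,
            payload: fresh_payload,
        }),
    }
}

/// Returns (replacements to write locally, replacements to push).
pub fn plan_repair(
    bad_ids: &[u64],
    clean_on_server: &HashMap<u64, Vec<u8>>,
) -> (Vec<Replacement>, Vec<Replacement>) {
    let mut local = Vec::new();
    let mut to_push = Vec::new();
    for &id in bad_ids {
        // Placeholder bytes; the real payload is an encrypted Delete record.
        let fresh = b"delete-noop".to_vec();
        match resolve_replacement(id, clean_on_server, fresh) {
            // Already fixed remotely: overwrite locally, push nothing.
            Resolution::ServerHadClean(r) => local.push(r),
            // First host to repair: write locally and queue for the server.
            Resolution::MintedLocally(r) => {
                local.push(r.clone());
                to_push.push(r);
            }
        }
    }
    (local, to_push)
}
```

In this sketch host B's `to_push` stays empty exactly when every bad id already has a clean copy in the server view, which matches the behaviour described for the second host.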
Re-running mid-repair after a crash
The server UPDATE is keyed on (user_id, client_id) and replaces data/cek wholesale. Applying the same replacement twice yields the same row state — the second UPDATE just writes identical bytes. Locally, delete(id) + push_batch(replacement) likewise converges: the row either didn't exist yet (insert) or gets dropped and re-inserted with the same contents.
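As an illustration of why replaying the server write is harmless, the following sketch models the table as an in-memory map keyed on (user_id, client_id). The `Row`, `Key`, and `apply_repair` names are hypothetical; the real server runs a SQL UPDATE, which this only imitates.

```rust
use std::collections::HashMap;

/// Stand-in for the two columns the repair replaces wholesale.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub data: Vec<u8>,
    pub cek: Vec<u8>,
}

/// (user_id, client_id), the key the UPDATE is scoped to.
pub type Key = (u32, u64);

/// Imitates an UPDATE scoped to one user's record.
/// Returns true when a row matched and was overwritten.
pub fn apply_repair(
    table: &mut HashMap<Key, Row>,
    user_id: u32,
    client_id: u64,
    replacement: &Row,
) -> bool {
    match table.get_mut(&(user_id, client_id)) {
        Some(row) => {
            // Whole-row replacement: the result does not depend on the old bytes.
            *row = replacement.clone();
            true
        }
        // No matching row for this user: nothing is created or changed.
        None => false,
    }
}
```

Since the new state depends only on the replacement and not on what was there before, a crash followed by a retry lands on the same row contents. Scoping the key to `user_id` also means a request can never touch another user's record with the same `client_id`.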
The Delete no-op itself is idempotent
The replacement payload is HistoryRecord::Delete(random_uuid). When any host processes this during incremental_build, it calls database.delete_rows([random_uuid]) — deleting a UUID that doesn't match any history row is a no-op, and deleting it repeatedly remains a no-op.
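A small model of that behaviour, assuming a much simplified `HistoryRecord` and `Database` (the real atuin types carry more fields and use proper UUIDs; `u128` is only a stand-in here, and the signatures of `incremental_build` and `delete_rows` are guesses):

```rust
use std::collections::HashMap;

/// Simplified stand-in for atuin's history record variants.
#[derive(Debug, Clone)]
pub enum HistoryRecord {
    Create { id: u128, command: String },
    Delete(u128),
}

/// Simplified stand-in for the local history database.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Database {
    pub rows: HashMap<u128, String>,
}

impl Database {
    /// Removes the given ids and returns how many rows actually existed.
    pub fn delete_rows(&mut self, ids: &[u128]) -> usize {
        ids.iter()
            .filter(|id| self.rows.remove(*id).is_some())
            .count()
    }
}

/// Applies records in order, as a host would when rebuilding its history.
pub fn incremental_build(db: &mut Database, records: &[HistoryRecord]) {
    for record in records {
        match record {
            HistoryRecord::Create { id, command } => {
                db.rows.insert(*id, command.clone());
            }
            HistoryRecord::Delete(id) => {
                // Deleting an id that matches no row changes nothing.
                db.delete_rows(&[*id]);
            }
        }
    }
}
```

Processing the same `Delete` of an unused id once, twice, or on several hosts leaves the rows untouched, which is what makes it safe as a replacement payload.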
What preserves idempotency
On each host, repair does the following:
repaired it), adopt that one locally.
The first host to run repair fixes the server; subsequent hosts just pull the fix down. --local-only skips the server round trip for offline use.
Changes introduced
POST /api/v0/record/repair
atuin store repair
Scope
Checks