Fix encrypted fresh DB page-size retargeting#7387
Conversation
|
Good morning @LeMikaelF, @penberg, @pereman2 👋 is there anything I can do to get this PR reviewed? |
|
@aalhour I don't know if you're a human or a bot, or some hybrid. To me, this looks like a one-shot of a coding agent, and even your last comment on this PR looks AI-generated. The code comments contain slop-sounding phrases that are nowhere else in the codebase, like "reserved-tail slicing", "layout repair" and "retargetting". These are all red flags to us, usually indicating that there was no human in the loop. |
|
@aalhour We welcome thoughtful contributions, and if you're a human willing to contribute, you're welcome. But see https://turso.tech/blog/what-happens-with-oss-in-the-age-of-ai for notes on the interaction of AI and OSS contribution. |
|
Hello @LeMikaelF, I am not sure how my last comment sounded AI generated, I thought of pinging you and the reviewers after the weekend since the last time we interacted was on Friday on the GitHub issue. My website and LinkedIn profile are attached to my GitHub profile, the same links are on my LinkedIn profile as well. The commits were signed by my public SSH key from my personal machine. If it helps with anything, here are my previous contributions to turso: https://github.com/tursodatabase/turso/pulls?q=is%3Apr+author%3Aaalhour+is%3Aclosed |
| // Regression coverage for https://github.com/tursodatabase/turso/issues/7375. | ||
| // PRAGMA cipher/hexkey before PRAGMA page_size used to leave EncryptionContext | ||
| // pinned to the default 4096-byte page size; the first write then panicked. | ||
|
|
||
| const ISSUE_7375_KEY_256: &str = "000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f"; | ||
| const ISSUE_7375_KEY_128: &str = "000102030405060708090a0b0c0d0e0f"; |
There was a problem hiding this comment.
this is extra noise and the comment is not relevant to the keys
you can just use the existing CIPHER_A / KEY_A etc.
|
@aalhour The patch is correct and looks okay, though i would make slight modifications however, it is not just AI assisted PR but rather fully generated from it. the comments are verbose and AI ish. I ask you to improve the comments |
| /// configuration: it must only be called while the pager is uninitialized, | ||
| /// and never to change cipher or key. Returns true when a context was | ||
| /// present and updated; false when no encryption context is installed. | ||
| pub(crate) fn retarget_encryption_page_size(&mut self, page_size: PageSize) -> bool { |
There was a problem hiding this comment.
I would prefer calling it as reset_page_size_in_encryption_ctx
| let retargeted = { | ||
| let mut io_ctx = self.io_ctx.write(); | ||
| io_ctx.retarget_encryption_page_size(size) | ||
| }; | ||
| if !retargeted { | ||
| return Ok(()); | ||
| } | ||
| let Some(wal) = self.wal.as_ref() else { | ||
| return Ok(()); | ||
| }; | ||
| wal.set_io_context(self.io_ctx.read().clone()); |
There was a problem hiding this comment.
Add a method on Pager::reset_page_size_in_encryption_ctx
within it, you can modify the ctx if the page size is non default. Then you can set the WAL ctx too
Description
This PR fixes encrypted database creation when the encryption context is initialised before the final database page size is selected.
Previously,
Pager::set_encryption_contextcaptured the current pager page size intoEncryptionContext. For a fresh database, that value can still be the default4096. IfPRAGMA page_sizelater changes the initial page size, the pager page size becomes e.g.512, but the encryption context still asserts/slices using4096, causing encrypted writes to panic.This PR retargets the encryption context page-size field from
Pager::set_initial_page_size, so all fresh-database page-size setup paths keep pager layout state and encryption layout state aligned.Why this fix
The fix is intentionally placed at the pager boundary where the invariant changes:
That covers all known reproducing surfaces:
PRAGMA cipher; PRAGMA hexkey; PRAGMA page_size=...ATTACHVACUUMon encrypted non-4K databasesThe encryption key and cipher mode are not rebuilt. Only the cached page-size value inside
EncryptionContextis updated, because the page size is used for page slicing/assertions and is independent of key derivation.No page-cache clear is needed here because
set_initial_page_sizealready requires the pager to be uninitialised.Alternatives considered
1) Fix only
Connection::reset_page_sizeThis was the smallest initial option, but it only covers the SQL PRAGMA path. It misses callers that set the initial page size directly on the pager, including fresh encrypted
ATTACHand VACUUM-related pager setup.That would fix the issue reproducer while leaving the same stale-context invariant reachable through other code paths.
2) Rebuild the full encryption context
Another option was to rebuild/reapply the full encryption context whenever the page size changes. That has a larger blast radius because it touches key/cipher setup semantics and risks introducing subtle differences in encryption initialisation order.
The bug does not require key material or cipher state to change. The stale field is only the page-size cache, so a narrow retarget is safer.
3) Remove
page_sizefromEncryptionContextLonger term,
EncryptionContextcould avoid storing page size and instead accept it from the caller at each encrypt/decrypt operation. That would make the invariant harder to violate by construction.I did not choose that here because it would require changing the encryption API surface and all call sites that rely on the current context shape. That is a broader refactor with more regression risk than needed for this bug fix.
Tests
Added regression coverage for:
PRAGMA page_sizePRAGMA page_sizeATTACHwith inherited non-4K page sizeVACUUMon encrypted non-4K databaseAlso verified nearby encryption, ATTACH, and VACUUM integration tests.
Motivation and context
Issue: #7375
Description of AI Usage
Same as previous contributions, used Codex for code & path discovery.