improved TUF artifact replication robustness #7519
Conversation
```rust
// This is the equivalent of applying `#[serde(transparent)]`, but that has a
// side effect of changing the JsonSchema derive to no longer emit a schema.
impl Serialize for Generation {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        self.0.serialize(serializer)
    }
}
```
I want to call out this change -- I believe there is a bug in progenitor 0.9.x where newtype structs that do not have `#[serde(transparent)]` cannot be serialized in a query string; I need to go file an issue for it. But I think it is more accurate to manually implement `Serialize` in Omicron anyway. In practice this change does not affect existing JSON serialization, because serde_json treats newtype structs as their inner value.
I've only taken a look at the docs so far, but looks solid! Thanks for writing that up. I'll finish the review later or tomorrow.
```diff
-) -> Result<Inventory> {
+) -> Result<(ArtifactConfig, Inventory)> {
     let generation =
         self.datastore.update_tuf_generation_get(opctx).await?;
     let mut inventory = Inventory::default();
     let mut paginator = Paginator::new(SQL_BATCH_SIZE);
     while let Some(p) = paginator.next() {
```
When we call this, the generation can change out from underneath us, or new artifacts can be added after we have already read the old generation. This stems from the fact that the generation is not coupled to any set of artifacts, so the database does not record which artifacts a generation is tied to. They are updated independently and read independently. There needs to be some kind of logical mapping exposed in the database.
How do you logically map a generation number to deleted artifacts?
It might also be possible to get the artifact list and the generation number simultaneously via a JOIN or a transaction (I'm not sure, I don't dabble in CRDB consistency much...).
Sorry, this was based on my faulty understanding above where I didn't see that writes were in a transaction. I think you can slap the generation read and pagination in a transaction and this should solve the issue here.
I decided that putting pagination within a transaction seemed arduous. So, I have added a logical mapping with a `generation_added` column. This will ensure that reading the list of artifacts in a generation is consistent even if another generation is added in between pages. The plan is still to delete artifact rows when all the repos referencing them are deleted, but we can change that plan and add a `generation_deleted` column later instead, too.
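To make the consistency argument concrete, here is a minimal in-memory sketch (not the Omicron schema or query): a reader that filters on the generation it observed first will not see rows stamped with a later `generation_added`, so pages read before and after a concurrent insert agree.

```rust
// Hypothetical in-memory model of artifacts stamped with `generation_added`.
struct Artifact {
    id: u32,
    generation_added: u64,
}

/// Return the artifact ids visible at `generation`, split into pages.
fn list_pages(artifacts: &[Artifact], generation: u64, page_size: usize) -> Vec<Vec<u32>> {
    let visible: Vec<u32> = artifacts
        .iter()
        .filter(|a| a.generation_added <= generation)
        .map(|a| a.id)
        .collect();
    visible.chunks(page_size).map(|c| c.to_vec()).collect()
}

fn main() {
    let mut artifacts = vec![
        Artifact { id: 1, generation_added: 1 },
        Artifact { id: 2, generation_added: 1 },
        Artifact { id: 3, generation_added: 2 },
    ];
    // Start paginating at generation 2...
    let before = list_pages(&artifacts, 2, 2);
    // ...then a concurrent repo insert bumps the generation mid-pagination.
    artifacts.push(Artifact { id: 4, generation_added: 3 });
    let after = list_pages(&artifacts, 2, 2);
    // The view filtered at generation 2 is unchanged, so pagination stays
    // consistent across the concurrent insert.
    assert_eq!(before, after);
    println!("consistent");
}
```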
I could use an extra (few) sets of eyes looking at the modified implementation of `DataStore::update_tuf_repo_insert` (more specifically the `insert_impl` function). In particular, we start the transaction by fetching the current generation and selecting the new generation, and filling in the `generation_added` field in all the artifacts:
`omicron/nexus/db-queries/src/db/datastore/update.rs`, lines 181 to 187 in `41c5d11`:
```rust
// Load the current generation from the database and increment it, then
// use that when creating the `TufRepoDescription`. If we determine there
// are any artifacts to be inserted, we update the generation to this value
// later.
let old_generation = get_generation(&conn).await?;
let new_generation = old_generation.next();
let desc = TufRepoDescription::from_external(desc.clone(), new_generation);
```
Then, if we determine new artifacts are to be inserted, we write the new generation number:
`omicron/nexus/db-queries/src/db/datastore/update.rs`, lines 311 to 325 in `41c5d11`:
```rust
if !new_artifacts.is_empty() {
    // Since we are inserting new artifacts, we need to bump the
    // generation number.
    debug!(log, "setting new TUF repo generation";
        "generation" => new_generation,
    );
    put_generation(&conn, old_generation.into(), new_generation.into())
        .await?;

    // Insert new artifacts into the database.
    diesel::insert_into(dsl::tuf_artifact)
        .values(new_artifacts)
        .execute_async(&conn)
        .await?;
}
```
`put_generation` will only update the generation if it is currently the old generation, and returns an error if no rows were updated:
`omicron/nexus/db-queries/src/db/datastore/update.rs`, lines 373 to 389 in `41c5d11`:
```rust
async fn put_generation(
    conn: &async_bb8_diesel::Connection<crate::db::DbConnection>,
    old_generation: nexus_db_model::Generation,
    new_generation: nexus_db_model::Generation,
) -> Result<nexus_db_model::Generation, DieselError> {
    use db::schema::tuf_generation::dsl;

    // We use `get_result_async` instead of `execute_async` to check that we
    // updated exactly one row.
    diesel::update(dsl::tuf_generation.filter(
        dsl::singleton.eq(true).and(dsl::generation.eq(old_generation)),
    ))
    .set(dsl::generation.eq(new_generation))
    .returning(dsl::generation)
    .get_result_async(conn)
    .await
}
```
I'm not 100% sure this is the right way to do this; if the generation number is incremented by another transaction first, I would prefer to retry this transaction rather than return an unretryable error. I don't understand enough about whether CockroachDB would detect this as a transaction conflict and tell us to retry it.
I like this strategy a lot and think it should work with the serializable constraints of the DB. Thanks for adding this support!
> I could use an extra (few) sets of eyes looking at the modified implementation of `DataStore::update_tuf_repo_insert` (more specifically the `insert_impl` function). In particular, we start the transaction by fetching the current generation and selecting the new generation, and filling in the `generation_added` field in all the artifacts:
I took a pretty close look and it looks great AFAICT.
> I'm not 100% sure this is the right way to do this; if the generation number is incremented by another transaction first, I would prefer to retry this transaction than return an unretryable error. I don't understand enough about whether CockroachDB would detect this as a transaction conflict and tell us to retry it.
I don't think CRDB will retry for us, but I'm not really sure. Would it be worth adding a test for this?
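If the conflict does surface as an ordinary error rather than a CRDB retryable error, one option discussed here is to wrap the whole read-modify-write in an application-level retry loop. A hedged sketch (the names `with_retries` and `try_insert` are illustrative, not Omicron APIs):

```rust
// Re-run a fallible read-modify-write up to `max_attempts` times, returning
// the last error if every attempt conflicts.
fn with_retries<T, E>(
    max_attempts: usize,
    mut try_insert: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        match try_insert() {
            Ok(v) => return Ok(v),
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}

fn main() {
    // Simulate a transaction that hits a generation conflict twice before
    // succeeding on the third attempt.
    let mut attempts = 0;
    let result = with_retries(5, || {
        attempts += 1;
        if attempts < 3 { Err("conflict") } else { Ok(attempts) }
    });
    assert_eq!(result, Ok(3));
    println!("retried {attempts} times");
}
```

The important property is that each retry re-reads the current generation, so a stale compare-and-swap failure becomes a fresh attempt instead of an unretryable error.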
This most recent push only includes a merge from main and some of the docs nits; I'm going to be working on writing the generation number out to a ledger on the filesystem and making the artifact list query more consistent.
I think the only thing outstanding here is how well the transaction retries based on how the query is currently written. I'm tempted to open an issue to track resolving that since this PR is pretty long-lived now.
Ship it. Thanks for all the hard work on this @iliana!
Sounds good to me.
Closes #7399.
Nexus now owns and maintains a generation number for the set of artifacts the system wants to be fully replicated, which is used by Sled Agent to prevent conflicts. The generation number is stored in a new singleton table based on the existing `db_metadata` singleton. I wrote up `docs/tuf-artifact-replication.adoc` to provide a top-level overview of the system and some of the conflicts that this refactor seeks to prevent.

The Sled Agent artifact store APIs are modified. Two new APIs exist for getting and putting an "artifact configuration": the list of wanted artifacts and its associated generation number. The list request returns the current generation number as well, and the PUT and "copy from depot" requests require an up-to-date generation number in the query string. The delete API is removed in favor of Sled Agent managing deletions on its own whenever the configuration is updated.
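The generation-gating described above can be sketched as follows. This is an illustrative in-memory model, not the actual Sled Agent API surface; the type and method names (`ArtifactConfig`, `put_config`, `put_artifact`) are stand-ins for whatever the real endpoints expose.

```rust
// Hypothetical model of an artifact store whose writes are gated on an
// up-to-date generation number.
struct ArtifactConfig {
    generation: u64,
    wanted: Vec<String>,
}

struct ArtifactStore {
    config: ArtifactConfig,
}

impl ArtifactStore {
    /// Accept a new configuration only if it advances the generation.
    fn put_config(&mut self, new: ArtifactConfig) -> Result<(), &'static str> {
        if new.generation <= self.config.generation {
            return Err("stale generation");
        }
        self.config = new;
        Ok(())
    }

    /// A PUT of artifact data must quote the current generation, modeling
    /// the generation number carried in the query string.
    fn put_artifact(&self, generation: u64, name: &str) -> Result<(), &'static str> {
        if generation != self.config.generation {
            return Err("generation mismatch");
        }
        if !self.config.wanted.iter().any(|w| w == name) {
            return Err("artifact not wanted");
        }
        Ok(())
    }
}

fn main() {
    let mut store = ArtifactStore {
        config: ArtifactConfig { generation: 1, wanted: vec!["a".into()] },
    };
    // Nexus pushes a new configuration at generation 2.
    store
        .put_config(ArtifactConfig { generation: 2, wanted: vec!["b".into()] })
        .unwrap();
    // A writer still holding generation 1 is rejected, preventing conflicts.
    assert_eq!(store.put_artifact(1, "b"), Err("generation mismatch"));
    assert_eq!(store.put_artifact(2, "b"), Ok(()));
    println!("gated");
}
```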