
BlueprintSledConfig: One generation number for sled-agent instead of three #7788

Merged
merged 14 commits from john/bp-sled-config-one-gen into main on Mar 17, 2025

Conversation

jgallagher (Contributor)

This is the first half of #7774. I naively expected to do the blueprint side and the sled-agent side together, but the blueprint side alone is pretty big already, so here it is.

The primary changes are in nexus-types:

  • Remove BlueprintZonesConfig, BlueprintDatasetsConfig, and BlueprintPhysicalDisksConfig; each of these was a generation number + an appropriate IdMap
  • BlueprintSledConfig now contains a single generation number and three IdMaps (a rough sketch of this shape change appears below)

and everything else is fallout from that. Most of the changes are mechanical (and I highly recommend viewing the diff with whitespace ignored; it cuts a few hundred +/- lines). The bits that are less mechanical and deserve a closer look are:

  • Database schema changes
  • Blueprint diff changes (bikeshedding welcome on how to present the single config generation number)
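
For orientation, here's a rough sketch of the shape change described above. This is a hand-written approximation, not the exact nexus-types definitions: the aliases exist only so the sketch compiles on its own, and the field name `sled_agent_generation` is an assumption.

```rust
use std::collections::BTreeMap;

// Placeholder stand-ins so this sketch is self-contained; the real types
// live in nexus-types / omicron-common and are richer than this.
type Generation = u64;
type IdMap<T> = BTreeMap<String, T>;
struct BlueprintZoneConfig;
struct BlueprintDatasetConfig;
struct BlueprintPhysicalDiskConfig;

// Before this PR: three wrapper types, each carrying its own generation.
struct BlueprintZonesConfig {
    generation: Generation,
    zones: IdMap<BlueprintZoneConfig>,
}
struct BlueprintDatasetsConfig {
    generation: Generation,
    datasets: IdMap<BlueprintDatasetConfig>,
}
struct BlueprintPhysicalDisksConfig {
    generation: Generation,
    disks: IdMap<BlueprintPhysicalDiskConfig>,
}

// After this PR: one generation covering all three IdMaps.
struct BlueprintSledConfig {
    sled_agent_generation: Generation,
    zones: IdMap<BlueprintZoneConfig>,
    datasets: IdMap<BlueprintDatasetConfig>,
    disks: IdMap<BlueprintPhysicalDiskConfig>,
}
```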

One decision I think is fine but is also worth double checking: as of this PR, we still have to send 3 generation numbers to sled-agent in OmicronSledConfig, so I'm replicating the single BlueprintSledConfig generation for all three of the sub-configs. I think this is fine; in practice if deployed, it means a sled might see its disks or datasets generation number jump a few units up without any changes to the actual disks or datasets, but I believe that's effectively a no-op other than rewriting the ledger JSON.
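
To make the "replicating the single generation" point concrete, here is a hedged sketch of the fan-out. The type and field names (`SubConfig`, `SledAgentRequest`) are purely illustrative stand-ins for the real `OmicronSledConfig` sub-configs:

```rust
// Illustrative only: how a single blueprint generation could be copied into
// each of the three generations sled-agent still expects today.
#[derive(Debug)]
struct SubConfig {
    generation: u64,
    // item payloads omitted; only the generation matters for this sketch
}

#[derive(Debug)]
struct SledAgentRequest {
    disks: SubConfig,
    datasets: SubConfig,
    zones: SubConfig,
}

fn fan_out(blueprint_generation: u64) -> SledAgentRequest {
    // The same generation lands in all three sub-configs, so a blueprint
    // generation bump shows up in the disks/datasets generations even when
    // nothing about the disks or datasets actually changed.
    SledAgentRequest {
        disks: SubConfig { generation: blueprint_generation },
        datasets: SubConfig { generation: blueprint_generation },
        zones: SubConfig { generation: blueprint_generation },
    }
}

fn main() {
    println!("{:?}", fan_out(7));
}
```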

@jgallagher (Contributor, Author)

> I think this is fine; in practice if deployed, it means a sled might see its disks or datasets generation number jump a few units up without any changes to the actual disks or datasets, but I believe that's effectively a no-op other than rewriting the ledger JSON.

I think this is actually not fine; the helios/deploy test failure looks like another instance of #7622 (albeit for a different reason):

% rg 'physical disks ensure' oxide-sled-agent_default.log | looker
21:37:56.429Z INFO SledAgent: physical disks ensure
    file = sled-agent/src/sled_agent.rs:895
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:00.195Z INFO SledAgent: physical disks ensure: Updated storage
    file = sled-agent/src/sled_agent.rs:899
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:00.195Z INFO SledAgent: physical disks ensure: Propagating new generation of disks
    file = sled-agent/src/sled_agent.rs:925
    generation = Generation(2)
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:00.195Z INFO SledAgent: physical disks ensure: Updated storage monitor
    file = sled-agent/src/sled_agent.rs:930
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:00.195Z INFO SledAgent: physical disks ensure: Updated zone bundler
    file = sled-agent/src/sled_agent.rs:935
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:00.195Z INFO SledAgent: physical disks ensure: Updated probes
    file = sled-agent/src/sled_agent.rs:941
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:00.195Z INFO SledAgent: physical disks ensure: Updated instances
    file = sled-agent/src/sled_agent.rs:946
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:12.714Z INFO SledAgent: physical disks ensure
    file = sled-agent/src/sled_agent.rs:895
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:12.715Z INFO SledAgent: physical disks ensure: Updated storage
    file = sled-agent/src/sled_agent.rs:899
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4
21:38:12.715Z INFO SledAgent: physical disks ensure: Propagating new generation of disks
    file = sled-agent/src/sled_agent.rs:925
    generation = Generation(3)
    sled_id = 019961bd-caf2-4093-bb29-adcee288a2a4

I suspect that when the generation is bumped without any of the disks actually changing, the new generation is not propagated out and so we get stuck waiting for it here just like after expunging a nonexistent disk:

// Ensure that the StorageMonitor, and the dump devices, have committed
// to start using new disks and stop using old ones.
self.inner.storage_monitor.await_generation(*our_gen).await?;
info!(self.log, "physical disks ensure: Updated storage monitor");
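
To illustrate the failure mode, here is a minimal model of a waiter like this over a `tokio::sync::watch` channel. This is an assumed simplification for the sake of the example, not the actual StorageMonitor code:

```rust
use std::time::Duration;
use tokio::sync::watch;

// Waits until the observed generation reaches `target`. If the producer bumps
// its internal generation but never publishes on the channel (because it
// decided "nothing changed"), this loop never completes.
async fn await_generation(
    mut rx: watch::Receiver<u64>,
    target: u64,
) -> Result<(), watch::error::RecvError> {
    loop {
        if *rx.borrow_and_update() >= target {
            return Ok(());
        }
        rx.changed().await?;
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = watch::channel(1u64);

    // Model the buggy path: the producer considers generation 2 applied but
    // skips `tx.send(2)` because the set of disks didn't change.
    let _internal_generation_only: u64 = 2;

    // The waiter is now wedged; demonstrate with a timeout.
    let result = tokio::time::timeout(
        Duration::from_millis(100),
        await_generation(rx, 2),
    )
    .await;
    assert!(result.is_err(), "waiter never sees generation 2");

    drop(tx);
}
```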

@jgallagher (Contributor, Author)

> I think this is actually not fine; the helios/deploy test failure looks like another instance of #7622 (albeit for a different reason)

FWIW: I think it's fine to review this PR as-is. I'm going to work on fixing #7622 separately, and once that lands I'm expecting the deploy test will pass.

@andrewjstone (Contributor) left a comment:

Looks great. Love the simplification.

@@ -371,17 +388,13 @@ impl BpTableData for BlueprintZonesConfig {

// Useful implementation for printing two column tables stored in a `BTreeMap`, where
// each key and value impl `Display`.
impl<S1, S2> BpTableData for (Generation, &BTreeMap<S1, S2>)

Contributor:

🎉

.sleds
.added
.values()
.fold(0, |acc, c| acc + c.zones_config.zones.len())

Contributor:

It's so nice to get rid of those intermediate types. This reads much better now.

}
}
}

#[derive(Debug)]
pub struct DiffPhysicalDisksDetails {
// Newly added sleds don't have "before" disks

Contributor:

These generations were the most annoying things to work with. Very happy to get rid of them and simplify all the callers above.

@@ -0,0 +1,2 @@
ALTER INDEX IF EXISTS omicron.public.bp_sled_metadata @ bp_sled_state_pkey

Contributor:

TIL

jgallagher added a commit that referenced this pull request Mar 14, 2025
…ources` (#7793)

This isn't a very big PR, but I'm not super familiar or comfortable with
the pieces surrounding these changes, so please review carefully.
Rationale for these changes follows.

---

In both #7622 and
#7788 (comment),
we see `omicron_physical_disks_ensure` get stuck (apparently
indefinitely) here:
https://github.com/oxidecomputer/omicron/blob/3802ae4736a26b25b66be028c0a2217f6428a722/sled-agent/src/sled_agent.rs#L927-L930

I believe this can happen when there's a generation number change to the
disks config but the corresponding execution of
`StorageResources::synchronize_disk_management()` does not set `updated`
to true, which means it [will not send the new disk config on the
`disks_updated`
channel](https://github.com/oxidecomputer/omicron/blob/3802ae4736a26b25b66be028c0a2217f6428a722/sled-storage/src/resources.rs#L403-L405).
This leaves `StorageMonitor::await_generation()` stuck [waiting for an
update that is never
coming](https://github.com/oxidecomputer/omicron/blob/8222b27e850d2c3adc496222a317fe5ff96d1289/sled-agent/src/storage_monitor.rs#L59-L64).

Both of the above issues are examples of such paths:

1. In #7622, if the new config removes a disk that is physically not
present, we [do not set `updated` to
true](https://github.com/oxidecomputer/omicron/blob/3802ae4736a26b25b66be028c0a2217f6428a722/sled-storage/src/resources.rs#L331-L340).
2. In
#7788 (comment),
if the generation is increased but there have been no changes to the
disk config, we never set `updated` to true.

I think there are two design choices in `StorageResources` that make
this difficult to reason about, and this PR attempts to address them
both.

1. `StorageResources` holds both [`disks: AllDisks` and `disk_updates:
watch::Sender<AllDisks>`](https://github.com/oxidecomputer/omicron/blob/3802ae4736a26b25b66be028c0a2217f6428a722/sled-storage/src/resources.rs#L221-L231),
allowing the two to diverge at times and forcing us to choose when to sync
them back up. This PR removes `disk_updates` and changes `disks` to be the
`watch::Sender<AllDisks>`, removing the possibility of having two
different `AllDisks` that have diverged in contents. (We still have to
make local clones of the `AllDisks` via `Arc::make_mut()`, so we
temporarily have multiple copies that intentionally diverge, but they're
local and dropped when the methods return.) I tried to make it so we
didn't need to keep track of `updated` manually. (A minimal sketch of this
pattern follows the list.)
2. `StorageResources::set_config()` [updates the generation held inside
`self.disks`](https://github.com/oxidecomputer/omicron/blob/3802ae4736a26b25b66be028c0a2217f6428a722/sled-storage/src/resources.rs#L274),
but neither sends that information along the `disk_updates` channel nor
informs `synchronize_disk_resources` that the generation has changed.
This seems particularly dicey: after `set_config` returns, we've
increased the generation inside `self.disks` but have not actually acted
on the new config from that generation, so inspecting `self.disks` at
that point would be very confusing. This PR changes `set_config` to
record the generation "on the side" (i.e., not updating the generation
inside `self.disks`), and defers updating the generation inside
`self.disks` to `synchronize_disk_resources`.
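
As mentioned in item 1, here is a minimal sketch of the "state lives in the `watch::Sender`" pattern, using a simplified stand-in for `AllDisks`; the real sled-storage types and method names differ:

```rust
use std::sync::Arc;
use tokio::sync::watch;

// Simplified stand-in for `AllDisks`.
#[derive(Clone, PartialEq, Eq, Debug)]
struct Disks {
    generation: u64,
    names: Vec<String>,
}

// The watch sender *is* the single source of truth: there is no second copy
// to drift out of sync and no manual `updated` flag to forget to set.
struct Resources {
    disks: watch::Sender<Arc<Disks>>,
}

impl Resources {
    fn new() -> Self {
        let (tx, _rx) = watch::channel(Arc::new(Disks {
            generation: 0,
            names: Vec::new(),
        }));
        Self { disks: tx }
    }

    fn subscribe(&self) -> watch::Receiver<Arc<Disks>> {
        self.disks.subscribe()
    }

    fn apply_config(&self, generation: u64, names: Vec<String>) {
        // `send_if_modified` mutates the stored value in place and notifies
        // receivers iff the closure reports a change. Because a generation
        // bump alone counts as a change, it always reaches waiters, even
        // when the set of disks is unchanged.
        self.disks.send_if_modified(|disks| {
            let new = Disks { generation, names };
            if **disks == new {
                return false;
            }
            *Arc::make_mut(disks) = new;
            true
        });
    }
}

fn main() {
    let resources = Resources::new();
    let rx = resources.subscribe();
    resources.apply_config(2, vec!["u.2-0".to_string()]);
    assert_eq!(rx.borrow().generation, 2);
}
```

Any receiver obtained via `subscribe()` then observes every change that `apply_config` actually makes, including generation-only bumps, which is exactly the case the stuck `await_generation` path needed.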

I tested this on a4x2 by manually performing a "spurious disk config
generation bump", and confirmed sled-agent did not get wedged as it does
on `main`.

---

Fixes #7622.
@jgallagher (Contributor, Author)

> I think this is actually not fine; the helios/deploy test failure looks like another instance of #7622 (albeit for a different reason)
>
> FWIW: I think it's fine to review this PR as-is. I'm going to work on fixing #7622 separately, and once that lands I'm expecting the deploy test will pass.

This was fixed by #7793; after merging in main with that change landed, the tests on this PR pass with no further changes.

@jgallagher jgallagher merged commit 0c690f8 into main Mar 17, 2025
18 checks passed
@jgallagher jgallagher deleted the john/bp-sled-config-one-gen branch March 17, 2025 14:29