Problem
commonware-runtime's tokio::Config exposes a single, immutable-after-construction storage_directory, and every Storage backend resolves all paths underneath it. Concretely:
runtime/src/tokio/runtime.rs — Config { storage_directory: PathBuf, .. } with with_storage_directory(p) and storage_directory() accessors. The field is consumed when Storage is started (runtime.rs:386-392 passes self.cfg.storage_directory.clone() into IoUringConfig). After start(), the root is fixed for the lifetime of the runtime.
runtime/src/storage/tokio/mod.rs — Config { storage_directory: PathBuf, maximum_buffer_size: usize }. Every open_versioned / scan / remove resolves through self.cfg.storage_directory.join(partition).join(hex(name)) (mirrored in runtime/src/storage/iouring.rs at :115 for open_versioned and :185/:205 for remove/scan).
runtime/src/storage/mod.rs:181-194 — validate_partition_name rejects any character that isn't alphanumeric, -, or _. So a subdirectory hop (/) cannot be smuggled inside a partition name to redirect a particular partition to a different root.
Runner::start builds and owns a tokio runtime. Calling Runner::new(cfg).start(..) twice in one process — once per intended root — is invalid inside an already-running runtime.
Net result: every Storage-backed partition the application opens (block journals, archives, marshal metadata, consensus journals, and DKG share material) MUST live under one filesystem root. There is no API to peel off a single high-trust partition onto a separate volume.
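The resolution rule described above can be sketched in isolation. This is a minimal stand-in, not the crate's code: blob_path is a hypothetical helper, hex here is a local lowercase encoder standing in for the hex(name) call, and restricting "alphanumeric" to ASCII is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Stand-in for the partition-name rule: only alphanumerics, '-' and '_'
/// are accepted. (Assumption: ASCII alphanumerics only.)
fn validate_partition_name(partition: &str) -> Result<(), String> {
    let ok = partition
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    if ok {
        Ok(())
    } else {
        Err(format!("invalid partition name: {partition:?}"))
    }
}

/// Lowercase hex encoding of a blob name (local stand-in for `hex(name)`).
fn hex(name: &[u8]) -> String {
    name.iter().map(|b| format!("{b:02x}")).collect()
}

/// Hypothetical helper: resolve a blob as `<root>/<partition>/<hex(name)>`.
fn blob_path(root: &Path, partition: &str, name: &[u8]) -> Result<PathBuf, String> {
    validate_partition_name(partition)?;
    Ok(root.join(partition).join(hex(name)))
}
```

Because '/' and '.' both fail validation, a name such as ../dkg or a/b is rejected before any join happens, so the single root is the only place a blob can land.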
Use case
Production consensus operators have different durability classes for different state:
- Frequently rotated / rebuildable data (block journals, archives, sync state) on a large, recyclable volume that can be wiped and restored from peers.
- Rare-write, irreplaceable secrets (BLS shares, DKG output, signing keys) on a small, snapshot-backed volume with a different ops policy.
Today these must share one filesystem root. A maintenance operation on the recyclable volume (a kubectl delete pvc in Kubernetes, a snapshot roll, an ops-driven re-init) wipes the secret material as collateral damage. In a BLS threshold cluster with n validators and threshold t = 2f+1, once more than n - t validators have lost their shares the survivors can no longer reach t, which crosses the BFT bound and wedges consensus until a fresh DKG ceremony is run. In a small cluster, two such collateral-damage events can be enough.
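The arithmetic behind that bound, as a small sketch. It assumes the usual n = 3f + 1 sizing, which this issue does not state explicitly; can_reach_threshold is a hypothetical helper name.

```rust
/// True if the surviving shares can still reach the threshold t = 2f + 1
/// after `lost` validators have lost their share material.
fn can_reach_threshold(f: u32, lost: u32) -> bool {
    let n = 3 * f + 1; // assumption: n = 3f + 1 validators
    let t = 2 * f + 1; // threshold from the text above
    n.saturating_sub(lost) >= t
}
```

With f = 1 (n = 4, t = 3) a single wiped share is survivable, but a second one leaves 2 shares against a threshold of 3.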
The only mitigations available today are operationally fragile:
- Mount one PVC at the root and use subPath overlays to project a second PVC underneath it. The lifecycle of the subPath mount is tied to the parent mount, and a kubectl delete pvc on the parent silently wipes everything not actually backed by the second PVC unless the operator is very careful.
- Run secret-bearing storage out-of-process (separate node, separate runtime), losing all the in-process sharing the rest of the stack provides.
- Fork commonware-runtime's Storage impl.
Proposal
Three API shapes for maintainers to weigh in on. Any would work; preference is A.
Shape A — with_storage_root(path) on the runtime context
Add a per-context override that propagates into the next storage backend it constructs:
    let context = Runner::new(cfg).start(|context| async move {
        // Default root from Config: used by everything not overridden.
        let qmdb = open_qmdb(context.with_label("qmdb")).await?;
        let journal = open_journal(context.with_label("journal")).await?;

        // Override for DKG share material onto a separate PVC.
        let dkg_ctx = context
            .with_storage_root("/var/lib/dkg-protected")
            .with_label("dkg");
        let dkg = open_dkg(dkg_ctx).await?;
    });
- Pros: mirrors existing with_label; minimal caller change; composable per call site.
- Cons: requires TokioStorage to support multiple live roots; needs a clear story for what happens if two contexts with different roots open partitions of the same name (suggested: roots are independent, partition-name uniqueness is per-root).
Shape B — explicit per-partition root mapping in Config
    let cfg = tokio::Config::new()
        .with_storage_directory("/var/lib/state")
        .with_partition_root_override("dkg_states", "/var/lib/dkg-protected/dkg_states")
        .with_partition_root_override("dkg_msgs", "/var/lib/dkg-protected/dkg_msgs");
- Pros: fully explicit; auditable in config; no new context machinery.
- Cons: caller must enumerate every partition name; brittle when upstream renames or adds prefixes; doesn't compose with libraries that own their partition names internally.
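A minimal sketch of how the Shape B lookup could behave. Config here is a local stand-in, not the real tokio::Config, and the partition_root_overrides field and partition_dir method are hypothetical names. It follows the example above in treating the override value as the full directory for that partition.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Local stand-in for the runtime config (not the real `tokio::Config`).
#[derive(Default)]
struct Config {
    storage_directory: PathBuf,
    // Hypothetical field: partition name -> directory override.
    partition_root_overrides: HashMap<String, PathBuf>,
}

impl Config {
    fn new() -> Self {
        Self::default()
    }

    fn with_storage_directory(mut self, dir: impl Into<PathBuf>) -> Self {
        self.storage_directory = dir.into();
        self
    }

    fn with_partition_root_override(
        mut self,
        partition: impl Into<String>,
        dir: impl Into<PathBuf>,
    ) -> Self {
        self.partition_root_overrides.insert(partition.into(), dir.into());
        self
    }

    /// Hypothetical lookup: use the override if present, else the default root.
    fn partition_dir(&self, partition: &str) -> PathBuf {
        match self.partition_root_overrides.get(partition) {
            Some(dir) => dir.clone(),
            None => self.storage_directory.join(partition),
        }
    }
}
```

The exact-match lookup is also what makes this shape brittle: a partition renamed or prefixed upstream silently falls back to the default root.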
Shape C — separate Storage impls under one runtime
Allow constructing a MeteredStorage<TokioStorage> (or equivalent) directly from a path, decoupled from Config::storage_directory, and pass it into individual subsystems via a per-init context wrapper.
- Pros: maximally flexible; cleanest separation of concerns; opens the door to mixed backends (e.g. tokio fs for one root, iouring for another).
- Cons: more invasive; touches the runtime/storage boundary.
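To make Shape C concrete, a rough sketch in which each subsystem is handed a storage handle built directly from a path. The Storage trait, DirStorage, and Subsystem below are all hypothetical stand-ins for illustration and do not reflect the crate's actual traits.

```rust
use std::path::PathBuf;

/// Hypothetical minimal storage interface: resolve a partition to a directory.
trait Storage {
    fn partition_dir(&self, partition: &str) -> PathBuf;
}

/// Hypothetical directory-backed storage constructed directly from a path,
/// independent of any runtime-wide storage directory.
struct DirStorage {
    root: PathBuf,
}

impl DirStorage {
    fn from_path(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }
}

impl Storage for DirStorage {
    fn partition_dir(&self, partition: &str) -> PathBuf {
        self.root.join(partition)
    }
}

/// A subsystem that receives its storage explicitly instead of inheriting it.
struct Subsystem<S: Storage> {
    storage: S,
    partition: String,
}

impl<S: Storage> Subsystem<S> {
    fn new(storage: S, partition: &str) -> Self {
        Self { storage, partition: partition.to_string() }
    }

    fn data_dir(&self) -> PathBuf {
        self.storage.partition_dir(&self.partition)
    }
}
```

Because the subsystem is generic over the storage type, a different backend could be substituted per root without touching the subsystem code.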
Recommended sketch (Shape A)
    // runtime/src/tokio/context.rs (illustrative)
    use std::path::{Path, PathBuf};

    impl Context {
        /// Override the storage root for any Storage handle subsequently obtained
        /// from this context. Inherits from `Config::storage_directory` if unset.
        pub fn with_storage_root(mut self, root: impl Into<PathBuf>) -> Self {
            self.storage_root_override = Some(root.into());
            self
        }
    }

    // runtime/src/storage/tokio/mod.rs
    impl Storage {
        // The returned path may borrow from either `self` or `ctx`, so both
        // borrows need one explicit lifetime (elision would tie it to `self` only).
        fn root_for<'a>(&'a self, ctx: &'a Context) -> &'a Path {
            ctx.storage_root_override
                .as_deref()
                .unwrap_or(&self.cfg.storage_directory)
        }
    }
Path resolution inside open_versioned/scan/remove becomes self.root_for(ctx).join(partition).join(hex(name)) instead of self.cfg.storage_directory.join(...). validate_partition_name is unchanged — the override path is provided by the operator/application, not embedded in a partition name.
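A self-contained version of the Shape A fallback, runnable outside the crate. Context, StorageConfig, and Storage here are simplified stand-ins carrying only the fields the sketch needs, and partition_dir is a hypothetical helper.

```rust
use std::path::{Path, PathBuf};

/// Stand-in for the runtime context: only the override field is modeled.
#[derive(Clone, Default)]
struct Context {
    storage_root_override: Option<PathBuf>,
}

impl Context {
    fn with_storage_root(mut self, root: impl Into<PathBuf>) -> Self {
        self.storage_root_override = Some(root.into());
        self
    }
}

/// Stand-in for the storage config and backend.
struct StorageConfig {
    storage_directory: PathBuf,
}

struct Storage {
    cfg: StorageConfig,
}

impl Storage {
    /// Override root if the context carries one, else the configured default.
    fn root_for<'a>(&'a self, ctx: &'a Context) -> &'a Path {
        ctx.storage_root_override
            .as_deref()
            .unwrap_or(&self.cfg.storage_directory)
    }

    /// Hypothetical helper: directory a partition resolves to for this context.
    fn partition_dir(&self, ctx: &Context, partition: &str) -> PathBuf {
        self.root_for(ctx).join(partition)
    }
}
```

A context without an override resolves exactly as before, which is the property the backwards-compatibility argument relies on.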
Backwards compatibility
- Default behavior unchanged: with no with_storage_root call (Shape A) or no with_partition_root_override entries (Shape B), every partition resolves under Config::storage_directory exactly as today.
- validate_partition_name keeps its current rules.
- New methods are additive; existing call sites compile and behave identically.
- No on-disk format change — partitions written under an overridden root are byte-identical to partitions written under the default root, just at a different absolute path.
Open questions
- Is per-context (A) or per-config (B) the better fit for the runtime's existing extension idioms? with_label precedent leans A; explicit-config precedent leans B.
- Should an overridden root be required to exist + be writable at override time, or lazily on first partition open?
- Should there be a storage_root_aliases list that surfaces in metrics / health endpoints so operators can see which roots are live?
- Is there appetite for Shape C as a longer-term direction even if A/B lands first?
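For the eager-versus-lazy question, one possible shape of an eager check, using only the standard library. check_root_writable and the probe file name are hypothetical; a lazy variant would simply defer the same check to the first partition open.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical eager check: the root must exist, be a directory, and accept
/// a small probe file. Returns the first I/O error encountered.
fn check_root_writable(root: &Path) -> io::Result<()> {
    let meta = fs::metadata(root)?; // fails if the root does not exist
    if !meta.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "storage root is not a directory",
        ));
    }
    // Write and remove a probe file to confirm the root is writable.
    let probe = root.join(".storage_root_probe");
    fs::write(&probe, b"")?;
    fs::remove_file(&probe)
}
```

Failing at override time surfaces a missing or read-only mount before any secret material is written, at the cost of one extra write per root.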
Happy to send a PR for whichever shape is preferred.
Files cited (v2026.4.0)
runtime/src/tokio/runtime.rs — Config definition and Runner::start plumbing
runtime/src/storage/tokio/mod.rs — Config { storage_directory, maximum_buffer_size } + path resolution
runtime/src/storage/iouring.rs:115,185,205 — same path resolution mirrored for the iouring backend
runtime/src/storage/mod.rs:181-194 — validate_partition_name (rejects /)
Prior art
Searched commonwarexyz/monorepo issues and PRs for storage_directory, partition root, multiple storage, secure storage, PVC. No existing thread proposes per-partition / per-init storage root override. Closest related work, none of which addresses this:
- Storage to ensure blobs are durably open/removed #1990: durably-open/remove semantics on Storage; orthogonal.
- Storage and Blob #148: original Storage + Blob trait introduction; established the single storage_directory model this issue proposes to extend.
- destroy to more storage classes; orthogonal.
- single_issuer in storage::iouring and network::iouring #1037: single_issuer enforcement; orthogonal.