Make worker installs reproducible with iii.lock #1527
Conversation
iii-worker can parse registry graph responses, write deterministic lockfiles, verify config against the lock, and expose sync/update/verify commands. The add path tries the graph resolver first while retaining the older single-worker response fallback.

Constraint: Preserve existing `iii worker add` behavior for old registries and built-ins
Rejected: Store lock metadata in config.yaml | mixes generated pins with user config
Rejected: Remove /download fallback | would break current registry fixtures and older APIs
Confidence: medium
Scope-risk: moderate
Directive: Keep stdout machine-readable; human status stays on stderr
Tested: cargo test -p iii-worker --no-fail-fast; cargo fmt --all; git -C ../iii diff --check
Not-tested: cargo clippy -p iii-worker --lib -- -D warnings, due to pre-existing lint debt in local_worker.rs, managed.rs, vm_boot.rs, worker_manager/libkrun.rs
📝 Walkthrough

Adds a YAML-backed `iii.lock` lockfile.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant CLI as "iii (CLI)"
    participant FS as "Filesystem"
    participant Registry as "Registry API"
    participant Sources as "Binary/Image Sources"

    User->>CLI: iii worker update [<name>]
    CLI->>FS: read config.yaml
    CLI->>Registry: fetch_resolved_worker_graph(name?, version, target)
    Registry-->>CLI: ResolvedWorkerGraph
    CLI->>Sources: fetch artifacts (binaries / images)
    Sources-->>CLI: artifacts (url, sha256 / image ref)
    CLI->>CLI: build WorkerLockfile from graph
    CLI->>FS: read/merge/write iii.lock

    User->>CLI: iii worker sync [--frozen]
    CLI->>FS: read iii.lock
    alt frozen = true
        CLI->>FS: read config.yaml
        CLI->>CLI: verify_config_workers()
    else
        CLI->>Sources: (optionally) ensure artifacts installed per lockfile
    end

    User->>CLI: iii worker verify
    CLI->>FS: read config.yaml & iii.lock
    CLI->>CLI: verify_config_workers()
    CLI-->>User: verification result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/iii-worker/src/cli/lockfile.rs`:
- Around line 62-64: The from_yaml function currently accepts any schema; after
deserializing into Self, validate the lockfile version field and reject
unsupported versions instead of silently accepting them. Change from_yaml to
parse into Self, check self.version (or the version field on the deserialized
struct) equals the supported value (e.g. 1), and return Err(format!("unsupported
{LOCKFILE_NAME} version: {}", self.version)) if not; keep using LOCKFILE_NAME
and ensure callers of from_yaml (e.g., sync/verify) only operate on validated
lockfiles.
In `@crates/iii-worker/src/cli/managed.rs`:
- Around line 222-225: The current flow calls
super::config_file::list_worker_names() which silently returns an empty Vec on
read errors, allowing lockfile.verify_config_workers(&names) to erroneously
succeed; change the call to a fallible API (e.g., add/use a read_result variant
or make list_worker_names return Result<Vec<_>, Error>) and handle the error
before calling lockfile.verify_config_workers: if reading config.yaml fails,
surface the error (e.g., eprintln! with context including "config.yaml" and the
error) and exit/return non‑zero instead of proceeding to call
verify_config_workers; update the branch that currently prints "config.yaml
matches iii.lock" to only run when the config read succeeded and
verify_config_workers returned Ok.
- Around line 587-605: The current match on fetch_resolved_worker_graph only
falls back when the graph parse fails, but it should also fall back when the
registry doesn't support /resolve (e.g., 404 or 405 responses) instead of
exiting; update the Err arm(s) in the match around fetch_resolved_worker_graph
so that errors indicating an unsupported resolve endpoint (HTTP 404/405 or
equivalent messages) are treated like the parse-failure case and run the same
fallback logic (call handle_resolved_graph_add(&graph, brief) flow or continue
to the existing /download path) rather than printing the error and returning 1;
refer to fetch_resolved_worker_graph, handle_resolved_graph_add, and finish_add
to locate the code to change.
- Around line 367-409: The loop in handle_resolved_graph_add calls handlers with
node.name without validating it; add a validation step to prevent invalid worker
names from causing install/config side effects. Before performing any installs,
validate each node.name (e.g., non-empty and matches your worker-name rules or
reuse an existing validator) and return a non-zero error with a clear message if
any name is invalid; alternately, perform a pre-check pass over graph to
validate all names before the main install loop. Update
handle_resolved_graph_add to reference node.name and use the validation result
to short-circuit before calling handle_binary_add or handle_oci_pull_and_add.
- Around line 188-210: handle_worker_sync currently only reads the lockfile and
prints the count instead of performing the non-frozen sync; replace the
"count-only" branch so that after reading the lockfile
(WorkerLockfile::read_from(lock_path)) you iterate lockfile.workers and perform
the lockfile-driven install/update workflow (invoke the existing install/pull
routines for each worker entry — e.g., the function(s) responsible for
installing binaries or images), apply any necessary updates to config.yaml,
surface errors (return non-zero on failure) and log progress, while keeping the
frozen path that delegates to handle_worker_verify() intact.
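For the config-read finding above, a minimal sketch of what a fallible reader could look like, assuming `config.yaml` keeps its workers under a top-level `workers` mapping keyed by name (the real schema is not shown in this thread):

```rust
use std::path::Path;

/// Fallible variant of `list_worker_names`: surfaces read/parse errors
/// instead of collapsing them into an empty Vec.
/// Assumes a top-level `workers` mapping keyed by worker name.
fn list_worker_names_result(config_path: &Path) -> Result<Vec<String>, String> {
    let raw = std::fs::read_to_string(config_path)
        .map_err(|e| format!("failed to read {}: {e}", config_path.display()))?;
    let doc: serde_yaml::Value = serde_yaml::from_str(&raw)
        .map_err(|e| format!("failed to parse {}: {e}", config_path.display()))?;
    let names = doc
        .get("workers")
        .and_then(|w| w.as_mapping())
        .map(|m| {
            m.keys()
                .filter_map(|k| k.as_str().map(str::to_string))
                .collect::<Vec<String>>()
        })
        .unwrap_or_default();
    Ok(names)
}
```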
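For the `/resolve` fallback finding, the error type of `fetch_resolved_worker_graph` is not visible here; one hedged approach is a string-level classifier, assuming the HTTP status survives into the error message:

```rust
/// Heuristic: treat HTTP 404/405 resolver failures as "this registry has
/// no /resolve endpoint" so the add path can fall back to /download.
/// String matching is an assumption about how the error is formatted.
fn resolve_endpoint_unsupported(err: &str) -> bool {
    err.contains("404")
        || err.contains("405")
        || err.contains("Not Found")
        || err.contains("Method Not Allowed")
}

fn main() {
    assert!(resolve_endpoint_unsupported("registry returned 405 Method Not Allowed"));
    assert!(!resolve_endpoint_unsupported("failed to parse resolved graph"));
}
```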
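And for the sync finding, the lockfile-driven loop could be shaped like the sketch below; the stand-in types and the `install_binary_worker`/`pull_oci_worker` helpers are hypothetical, since the crate's real install routines are not shown here:

```rust
use std::collections::BTreeMap;

// Stand-in shapes for this sketch only; field names are assumptions
// drawn from the iii.lock description in this thread.
struct LockedWorker {
    worker_type: String, // the entry's `type` field: "binary" or "image"
}
struct WorkerLockfile {
    workers: BTreeMap<String, LockedWorker>,
}

fn install_binary_worker(name: &str, _entry: &LockedWorker) -> Result<(), String> {
    // Placeholder: the real CLI would fetch the pinned url/sha256 artifact.
    println!("installing binary worker {name}");
    Ok(())
}

fn pull_oci_worker(name: &str, _entry: &LockedWorker) -> Result<(), String> {
    // Placeholder: the real CLI would pull the pinned image reference.
    println!("pulling image worker {name}");
    Ok(())
}

/// Walk every pinned worker, dispatch on its locked type, and report a
/// non-zero exit code if any install fails.
fn sync_from_lockfile(lockfile: &WorkerLockfile) -> i32 {
    let mut failures = 0;
    for (name, entry) in &lockfile.workers {
        let result = match entry.worker_type.as_str() {
            "binary" => install_binary_worker(name, entry),
            "image" => pull_oci_worker(name, entry),
            other => Err(format!("unsupported worker type '{other}'")),
        };
        if let Err(e) = result {
            eprintln!("error: {name}: {e}");
            failures += 1;
        }
    }
    if failures == 0 { 0 } else { 1 }
}
```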
📒 Files selected for processing (7)
- crates/iii-worker/src/cli/app.rs
- crates/iii-worker/src/cli/lockfile.rs
- crates/iii-worker/src/cli/managed.rs
- crates/iii-worker/src/cli/mod.rs
- crates/iii-worker/src/cli/registry.rs
- crates/iii-worker/src/main.rs
- crates/iii-worker/tests/worker_integration.rs
Add regression coverage for paths exercised by the new `sync`, `verify`, and `update` commands: lockfile round-trip through disk, verify_config symmetry, lockfile_from_graph error branches (missing binaries, wrong target, missing image, unsupported type), binary_config_yaml(null), and handler return codes when iii.lock is absent or the named worker is not pinned.
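A sketch of the first item, the disk round-trip, assuming `tempfile` is available as a dev-dependency and comparing via `to_yaml` so the lockfile type does not need `PartialEq`:

```rust
#[test]
fn lockfile_round_trips_through_disk() {
    let dir = tempfile::tempdir().unwrap();
    let path = dir.path().join("iii.lock");

    // Smallest valid v1 lockfile; richer entries would exercise sorting too.
    let original = WorkerLockfile::from_yaml("version: 1\nworkers: {}\n").unwrap();
    original.write_to(&path).unwrap();

    let reread = WorkerLockfile::read_from(&path).unwrap();
    assert_eq!(original.to_yaml().unwrap(), reread.to_yaml().unwrap());
}
```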
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/iii-worker/src/cli/lockfile.rs (1)
62-64: ⚠️ Potential issue | 🟡 Minor

Reject unsupported `iii.lock` versions on read. `from_yaml` still accepts any `version`, so future/incompatible schemas can be silently treated as v1 by `sync`/`verify`.

🛡️ Proposed validation

```diff
 pub fn from_yaml(input: &str) -> Result<Self, String> {
-    serde_yaml::from_str(input).map_err(|e| format!("failed to parse {LOCKFILE_NAME}: {e}"))
+    let lockfile: Self =
+        serde_yaml::from_str(input).map_err(|e| format!("failed to parse {LOCKFILE_NAME}: {e}"))?;
+    if lockfile.version != LOCKFILE_VERSION {
+        return Err(format!(
+            "unsupported {LOCKFILE_NAME} version {} (expected {})",
+            lockfile.version, LOCKFILE_VERSION
+        ));
+    }
+    Ok(lockfile)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/lockfile.rs` around lines 62 - 64, The from_yaml function currently accepts any lockfile version, risking silently treating future/incompatible schemas as v1; update the from_yaml implementation to deserialize into the lockfile struct (using serde_yaml::from_str) then explicitly check the deserialized.version field against the supported version constant (or allowed set) and return Err(format!("unsupported {LOCKFILE_NAME} version: {}", version)) when it doesn't match; make this change inside from_yaml (and keep the existing error wrapping for parse failures) so callers (e.g., sync/verify) only operate on supported lockfile versions.
🧹 Nitpick comments (1)
crates/iii-worker/src/cli/lockfile.rs (1)
78-80: Write `iii.lock` via a temporary file before replacing it.

`std::fs::write` truncates the destination first; a crash or full disk can leave `iii.lock` partially written and break subsequent `sync`/`verify` runs. Prefer writing beside the lockfile and renaming into place.

♻️ Safer replacement pattern

```diff
 pub fn write_to(&self, path: &Path) -> Result<(), String> {
     let yaml = self.to_yaml()?;
-    std::fs::write(path, yaml).map_err(|e| format!("failed to write {}: {e}", path.display()))
+    let tmp_path = path.with_extension("lock.tmp");
+    std::fs::write(&tmp_path, yaml)
+        .map_err(|e| format!("failed to write {}: {e}", tmp_path.display()))?;
+    std::fs::rename(&tmp_path, path).map_err(|e| {
+        let _ = std::fs::remove_file(&tmp_path);
+        format!("failed to replace {}: {e}", path.display())
+    })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/lockfile.rs` around lines 78 - 80, The write_to method currently uses std::fs::write which truncates the destination; change it to write the YAML to a temporary file next to the target (e.g., create a temp file in the same directory using a unique extension or NamedTempFile), flush and sync the temp file to disk, then atomically rename (std::fs::rename) the temp file over the target path to replace iii.lock; ensure errors from creating/writing/syncing/renaming are propagated as the existing Result<String, ()> pattern so callers of write_to see the failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/iii-worker/src/cli/lockfile.rs`:
- Around line 62-64: The from_yaml function currently accepts any lockfile
version, risking silently treating future/incompatible schemas as v1; update the
from_yaml implementation to deserialize into the lockfile struct (using
serde_yaml::from_str) then explicitly check the deserialized.version field
against the supported version constant (or allowed set) and return
Err(format!("unsupported {LOCKFILE_NAME} version: {}", version)) when it doesn't
match; make this change inside from_yaml (and keep the existing error wrapping
for parse failures) so callers (e.g., sync/verify) only operate on supported
lockfile versions.
---
Nitpick comments:
In `@crates/iii-worker/src/cli/lockfile.rs`:
- Around line 78-80: The write_to method currently uses std::fs::write which
truncates the destination; change it to write the YAML to a temporary file next
to the target (e.g., create a temp file in the same directory using a unique
extension or NamedTempFile), flush and sync the temp file to disk, then
atomically rename (std::fs::rename) the temp file over the target path to
replace iii.lock; ensure errors from creating/writing/syncing/renaming are
propagated through the existing Result<(), String> return type so callers of write_to see
the failure.
📒 Files selected for processing (2)
- crates/iii-worker/src/cli/lockfile.rs
- crates/iii-worker/src/cli/managed.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/iii-worker/src/cli/managed.rs
PR #1527 adds lockfile-backed managed worker resolution, so the docs now need a workflow page and reference entry that describe the current user-facing behavior accurately, including the intentionally limited sync behavior.

Constraint: Docs must reflect the current PR implementation, where non-frozen sync reads and reports the lockfile but does not install artifacts.
Rejected: Present iii worker sync as a full installer | current handler does not install from the lockfile yet
Confidence: high
Scope-risk: narrow
Directive: Keep sync wording aligned with CLI behavior before expanding the docs to an install workflow
Tested: docs/docs.json parse; docs navigation target check; changed-doc link check; git diff --check; Mintlify preview at http://localhost:3002 returned 200 for both new pages
Not-tested: pnpm fmt:check, due to missing node_modules / local Biome; cargo fmt, due to a pre-existing Rust formatting diff in PR code
The PR branch needed to absorb main after the changelog gained the new iii worker exec entry. The conflict resolution keeps main's exec note and the PR's reproducible worker install note together in the 0.11.0 changelog.

Constraint: Preserve main's merged changelog updates while keeping the iii.lock documentation entry from this PR.
Rejected: Drop either changelog entry | both describe distinct user-facing worker changes
Confidence: high
Scope-risk: moderate
Directive: Keep future changelog edits additive when multiple worker features land in the same release section
Tested: git diff --cached --check; cargo fmt --all -- --check; no unmerged files; conflict-marker search; docs/docs.json navigation target check; changed-doc link check
Not-tested: Full repo test suite after merging main
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/iii-worker/src/cli/managed.rs (1)
142-154: Silent fallback on `serde_yaml` serialization failure.

`unwrap_or_default()` turns any serialization error into an empty string and then into `None`, so a malformed registry config is silently dropped as "no config" instead of surfacing. Serialization failures on JSON→YAML are vanishingly rare in practice, but since this function is used for the config block that ends up in `config.yaml`, propagating the error (or at least `tracing::warn!`ing it) would prevent a confusing "worker installed but config section missing" outcome.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/managed.rs` around lines 142 - 154, The current use of serde_yaml::to_string(config).unwrap_or_default() swallows serialization errors; change it to explicitly handle the Result: call serde_yaml::to_string(config) and match on Ok(yaml_str) to continue with the existing strip_prefix/trim_end/empty checks, and on Err(e) call tracing::warn!(%e, ?config, "failed to serialize registry config to YAML") (or propagate the error if the surrounding function returns a Result) and then return None; this uses the same symbols (serde_yaml::to_string, tracing::warn!, strip_prefix) so you can locate and update the block that produces the YAML string.
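A sketch of the explicit handling this nitpick asks for, assuming the surrounding function returns `Option<String>`; the existing `strip_prefix`/trim handling of the YAML string is omitted here:

```rust
use serde::Serialize;

/// Explicit-handling variant of the flagged `unwrap_or_default()` call:
/// serialization failures are logged and mapped to None instead of being
/// silently flattened into an empty config block.
fn config_block_yaml<T: Serialize>(config: &T) -> Option<String> {
    match serde_yaml::to_string(config) {
        Ok(yaml) => Some(yaml),
        Err(e) => {
            tracing::warn!(error = %e, "failed to serialize registry config to YAML");
            None
        }
    }
}
```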
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/iii-worker/src/cli/managed.rs`:
- Around line 235-287: handle_worker_update currently re-resolves every entry in
lockfile.workers (names) including transitive deps and blindly inserts merged
nodes via handle_resolved_graph_add, which fails to prune removed nodes and
allows overlapping re-resolves to overwrite correct pins; fix by (1) changing
the None branch that builds names to only include top-level root workers from
your project config (parse config.yaml to get the true roots instead of using
lockfile.workers.keys()), (2) when processing each resolved graph returned by
fetch_resolved_worker_graph(name, ...) compute the set of nodes in that graph
and diff against lockfile.workers before calling handle_resolved_graph_add so
you remove any lockfile entries that are no longer referenced by that root, and
(3) update handle_resolved_graph_add (or replace its merge behavior) to perform
a replace-with-diff for the root’s subtree instead of unconditional
lockfile.workers.insert(...) so transitive dependencies aren’t re-pinned by
separate resolutions.
---
Nitpick comments:
In `@crates/iii-worker/src/cli/managed.rs`:
- Around line 142-154: The current use of
serde_yaml::to_string(config).unwrap_or_default() swallows serialization errors;
change it to explicitly handle the Result: call serde_yaml::to_string(config)
and match on Ok(yaml_str) to continue with the existing
strip_prefix/trim_end/empty checks, and on Err(e) call tracing::warn!(%e,
?config, "failed to serialize registry config to YAML") (or propagate the error
if the surrounding function returns a Result) and then return None; this uses
the same symbols (serde_yaml::to_string, tracing::warn!, strip_prefix) so you
can locate and update the block that produces the YAML string.
📒 Files selected for processing (8)
- crates/iii-worker/src/cli/app.rs
- crates/iii-worker/src/cli/managed.rs
- crates/iii-worker/src/cli/mod.rs
- crates/iii-worker/src/main.rs
- crates/iii-worker/tests/worker_integration.rs
- docs/changelog.mdx
- docs/docs.json
- docs/how-to/managing-container-workers.mdx
✅ Files skipped from review due to trivial changes (3)
- crates/iii-worker/src/cli/mod.rs
- docs/docs.json
- docs/changelog.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/how-to/managing-container-workers.mdx
- crates/iii-worker/tests/worker_integration.rs
The lockfile flow now matches the supported worker taxonomy: frozen verification checks registry-managed workers without rejecting built-ins, direct OCI refs, or local-path sandbox workers. Legacy registries that do not expose /resolve continue through the existing /download path, and bulk update skips dependency-only pins instead of treating every locked entry as a root.

Constraint: Direct OCI refs, built-ins, and local-path sandbox workers remain outside the current iii.lock format.
Constraint: iii.lock has no explicit roots field, so root updates are inferred from dependency edges for this PR.
Rejected: Require every config.yaml worker to be locked | breaks valid mixed worker projects and contradicts the documented lockfile scope.
Rejected: Add a lockfile roots schema field | broader public schema change than needed for this PR.
Confidence: high
Scope-risk: moderate
Directive: Do not broaden frozen verification without adding a config-level marker or lockfile schema support for unmanaged worker kinds.
Tested: cargo fmt --all
Tested: cargo fmt --all -- --check
Tested: cargo test -p iii-worker managed::tests -- --nocapture
Tested: cargo test -p iii-worker cli::registry::tests -- --nocapture
Tested: cargo test -p iii-worker sync_help_matches_current_read_only_behavior -- --nocapture
Tested: docs/docs.json parsed with node
Tested: git diff --check
Not-tested: Full cargo test -p iii-worker did not complete; worker_list_discovery_integration hung on a pre-existing copied /bin/sleep sentinel and the run was stopped after all affected suites had passed.
Not-tested: pnpm fmt:check did not complete; Biome exits on the existing nested-root configuration in console/packages/console-frontend/biome.json and sdk/biome.json.
Actionable comments posted: 2
♻️ Duplicate comments (4)
crates/iii-worker/src/cli/managed.rs (4)
415-457: ⚠️ Potential issue | 🟠 Major

Resolved graph node names are still not validated before install/config side effects. `node.name` comes straight from the registry payload and is passed into `handle_binary_add`/`handle_oci_pull_and_add`/`append_worker` without `validate_worker_name`. A hostile (or broken) registry can inject names with `..`, path separators, or YAML-breaking characters and cause filesystem writes and `config.yaml` edits before the rest of the pipeline notices.

🛡️ Pre-validate all nodes before any install side effect

```diff
 async fn handle_resolved_graph_add(graph: &ResolvedWorkerGraph, brief: bool) -> i32 {
     let target = graph
         .target
         .clone()
         .unwrap_or_else(|| binary_download::current_target().to_string());

+    for node in &graph.graph {
+        if let Err(e) = super::registry::validate_worker_name(&node.name) {
+            eprintln!(
+                "{} invalid resolved worker '{}': {}",
+                "error:".red(),
+                node.name,
+                e
+            );
+            return 1;
+        }
+    }
+
     for node in &graph.graph {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/managed.rs` around lines 415 - 457, The loop in handle_resolved_graph_add performs installs/config changes using node.name without validation; before doing any side effects, iterate over graph.graph and call validate_worker_name for each node.name (and also reject empty/missing names) and return a non-zero error if validation fails so no calls to handle_binary_add, handle_oci_pull_and_add, or append_worker occur; ensure you reference ResolvedWorkerGraph/graph.graph and node.name when adding this pre-check so all nodes are validated up front.
262-314: ⚠️ Potential issue | 🟠 Major

`worker update` still never prunes stale lockfile entries. `locked_root_worker_names` (lines 316-335) is a good improvement for the `None` branch: it filters out transitive deps before re-resolving, addressing the overlap concern. However, the merge in `handle_resolved_graph_add` (lines 467-471) is still insert-only: entries that existed in `iii.lock` but are no longer in the newly-resolved graph remain indefinitely. If `A@1` dropped its dep on `B` upstream, `B`'s pin lingers forever and `verify`/`sync --frozen` keep honoring a ghost requirement.

Consider diffing the resolved subtree for each root against the existing lockfile and removing entries that are no longer reachable from any root's new graph before writing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/managed.rs` around lines 262 - 314, handle_worker_update currently re-resolves root graphs via locked_root_worker_names but never prunes stale entries because handle_resolved_graph_add only inserts new pins; to fix, after fetching each resolved graph in handle_worker_update (or inside handle_resolved_graph_add) compute the set of nodes reachable from the newly resolved root graph and then remove any entries from the in-memory WorkerLockfile that are not reachable from any root's new graph before persisting iii.lock; use the existing symbols locked_root_worker_names, fetch_resolved_worker_graph, handle_resolved_graph_add and the WorkerLockfile structure to locate where to diff the resolved subtree against lockfile.workers and delete unreachable keys so removed upstream deps no longer linger.
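A sketch of the reachability-based prune, with a stand-in entry type; it assumes the dependency names recorded in `iii.lock` match lock keys exactly:

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

// Stand-in for a locked entry; only the dependency edges matter here.
struct LockedWorker {
    dependencies: Vec<String>,
}

/// Keep only lockfile entries reachable from the given roots through the
/// recorded dependency edges, so pins dropped upstream don't linger.
fn prune_unreachable(workers: &mut BTreeMap<String, LockedWorker>, roots: &[String]) {
    let mut reachable = BTreeSet::new();
    let mut queue: VecDeque<String> = roots.iter().cloned().collect();
    while let Some(name) = queue.pop_front() {
        if !reachable.insert(name.clone()) {
            continue; // already visited
        }
        if let Some(entry) = workers.get(&name) {
            queue.extend(entry.dependencies.iter().cloned());
        }
    }
    workers.retain(|name, _| reachable.contains(name));
}
```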
212-234: ⚠️ Potential issue | 🟠 Major

`verify` can still report success when `config.yaml` is unreadable. `list_worker_names()` swallows read errors and returns an empty `Vec`. `lockfile_relevant_config_worker_names` then filters an empty list to an empty list, and `verify_config_workers(&[])` returns `Ok(())`, so the command prints "config.yaml matches iii.lock" and exits 0 on an unreadable/invalid config. Use a fallible read API here and fail closed; filtering does not cover this case.
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/managed.rs` around lines 212 - 234, handle_worker_verify currently calls list_worker_names which swallows read errors, causing verify to succeed on unreadable config; change the flow to use a fallible API (or make list_worker_names return Result) so the function returns Err on config read failure: replace the call to list_worker_names() with a fallible reader (e.g., a new list_worker_names_result or a read_config_names that returns Result<Vec<String>, Error>), propagate the error out of handle_worker_verify (printing the error and returning non-zero) before calling lockfile_relevant_config_worker_names and verify_config_workers; keep using lockfile_relevant_config_worker_names(&lockfile, &names) and lockfile.verify_config_workers(&names) after successful read.
188-210: ⚠️ Potential issue | 🟠 Major

Non-frozen `worker sync` still only reports; it does not install. `handle_worker_sync` reads `iii.lock` and prints the count; it never installs binaries/images or reconciles `config.yaml`. A fresh checkout will print "✓ N worker(s) pinned" and exit 0 without producing any runnable state. This matches the "known gap" called out in the PR description but is worth landing before users hit it.
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/managed.rs` around lines 188 - 210, handle_worker_sync currently reads iii.lock (via WorkerLockfile::read_from) and only prints the pinned count (using lockfile.workers.len()) then exits; when frozen is false it should instead perform reconciliation/installation of worker binaries/images and reconcile config.yaml. Replace the printing-only path in handle_worker_sync with a call to the worker reconcile/install routine (implement one if missing, e.g., reconcile_workers_from_lockfile or handle_worker_install that accepts the loaded WorkerLockfile) and propagate its Result/exit code (return non-zero on errors); keep the verify path calling handle_worker_verify() for frozen=true and ensure errors from WorkerLockfile::read_from still produce non-zero exit codes as before.
🧹 Nitpick comments (1)
crates/iii-worker/src/cli/managed.rs (1)
131-156: Unwrap of inner `"config"` key can surprise users.

When the top-level registry `config` object happens to contain a `"config"` key, this code unwraps to that inner value and drops the rest. That is correct for the documented `{name, config}` registry shape, but a legitimate user config that defines its own `config:` subkey (not the wrapper) will be silently rewritten to just that subkey's contents. It is worth a comment here pointing to the exact wrapper shape this is keyed on so future contributors don't broaden it accidentally.
Verify each finding against the current code and only fix it if needed. In `@crates/iii-worker/src/cli/managed.rs` around lines 131 - 156, The function binary_config_yaml currently unwraps a nested "config" key for any top-level JSON object which can mistakenly drop other keys; update binary_config_yaml to only unwrap into the inner "config" when the object matches the expected registry wrapper shape (e.g., contains exactly the keys "name" and "config" or at least both "name" and "config" and no other unknown keys), and add a clear comment above the match explaining this is intentional for the documented {name, config} wrapper shape so future contributors don't broaden the behavior accidentally; reference the function name binary_config_yaml and the "config" key when making the change.
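One way to encode the wrapper-shape check this nitpick suggests; the exact-two-keys rule is this sketch's assumption, not the crate's current behavior:

```rust
use serde_json::Value;

/// Unwrap the inner "config" only for the documented registry wrapper
/// shape `{ "name": ..., "config": ... }`, leaving user configs that
/// merely contain their own `config:` subkey untouched.
fn unwrap_registry_config(value: &Value) -> &Value {
    match value.as_object() {
        Some(obj) if obj.len() == 2 && obj.contains_key("name") && obj.contains_key("config") => {
            &obj["config"]
        }
        _ => value,
    }
}
```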
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/iii-worker/src/cli/managed.rs`:
- Around line 421-484: The loop currently appends workers to config.yaml via
handle_binary_add/handle_oci_pull_and_add but returns early on first failure,
leaving iii.lock out of sync; modify managed.rs to track the names (and
corresponding WorkerLockfile entries) of successfully installed workers during
the for node in &graph.graph loop (e.g., collect into a Vec or a temporary
WorkerLockfile map), and if any handler returns rc != 0 then before returning
that rc write a best-effort lockfile containing only those successfully
installed workers (use the same WorkerLockfile struct read_from/insert/write_to
flow used later) so config.yaml and iii.lock remain consistent; update places
where lockfile_from_graph(graph, &target) is used to still produce the full
lockfile on success but ensure the early-return path uses the collected subset
and calls lockfile.write_to(...) before returning.
- Around line 279-314: After computing names (from worker_name or
locked_root_worker_names), add a check for names.is_empty() and if true print a
clear stderr message like "no workers pinned in iii.lock; nothing to update"
(using eprintln! and the same red prefix style as other errors) and return a
non-zero exit code (e.g., 1) so the command does not silently succeed; place
this check before the for name in &names loop (the surrounding function that
uses fetch_resolved_worker_graph, handle_resolved_graph_add, and fail_count).
---
Duplicate comments:
In `@crates/iii-worker/src/cli/managed.rs`:
- Around line 415-457: The loop in handle_resolved_graph_add performs
installs/config changes using node.name without validation; before doing any
side effects, iterate over graph.graph and call validate_worker_name for each
node.name (and also reject empty/missing names) and return a non-zero error if
validation fails so no calls to handle_binary_add, handle_oci_pull_and_add, or
append_worker occur; ensure you reference ResolvedWorkerGraph/graph.graph and
node.name when adding this pre-check so all nodes are validated up front.
- Around line 262-314: handle_worker_update currently re-resolves root graphs
via locked_root_worker_names but never prunes stale entries because
handle_resolved_graph_add only inserts new pins; to fix, after fetching each
resolved graph in handle_worker_update (or inside handle_resolved_graph_add)
compute the set of nodes reachable from the newly resolved root graph and then
remove any entries from the in-memory WorkerLockfile that are not reachable from
any root's new graph before persisting iii.lock; use the existing symbols
locked_root_worker_names, fetch_resolved_worker_graph, handle_resolved_graph_add
and the WorkerLockfile structure to locate where to diff the resolved subtree
against lockfile.workers and delete unreachable keys so removed upstream deps no
longer linger.
- Around line 212-234: handle_worker_verify currently calls list_worker_names
which swallows read errors, causing verify to succeed on unreadable config;
change the flow to use a fallible API (or make list_worker_names return Result)
so the function returns Err on config read failure: replace the call to
list_worker_names() with a fallible reader (e.g., a new list_worker_names_result
or a read_config_names that returns Result<Vec<String>, Error>), propagate the
error out of handle_worker_verify (printing the error and returning non-zero)
before calling lockfile_relevant_config_worker_names and verify_config_workers;
keep using lockfile_relevant_config_worker_names(&lockfile, &names) and
lockfile.verify_config_workers(&names) after successful read.
- Around line 188-210: handle_worker_sync currently reads iii.lock (via
WorkerLockfile::read_from) and only prints the pinned count (using
lockfile.workers.len()) then exits; when frozen is false it should instead
perform reconciliation/installation of worker binaries/images and reconcile
config.yaml. Replace the printing-only path in handle_worker_sync with a call to
the worker reconcile/install routine (implement one if missing, e.g.,
reconcile_workers_from_lockfile or handle_worker_install that accepts the loaded
WorkerLockfile) and propagate its Result/exit code (return non-zero on errors);
keep the verify path calling handle_worker_verify() for frozen=true and ensure
errors from WorkerLockfile::read_from still produce non-zero exit codes as
before.
---
Nitpick comments:
In `@crates/iii-worker/src/cli/managed.rs`:
- Around line 131-156: The function binary_config_yaml currently unwraps a
nested "config" key for any top-level JSON object which can mistakenly drop
other keys; update binary_config_yaml to only unwrap into the inner "config"
when the object matches the expected registry wrapper shape (e.g., contains
exactly the keys "name" and "config" or at least both "name" and "config" and no
other unknown keys), and add a clear comment above the match explaining this is
intentional for the documented {name, config} wrapper shape so future
contributors don't broaden the behavior accidentally; reference the function
name binary_config_yaml and the "config" key when making the change.
📒 Files selected for processing (7)
- crates/iii-worker/src/cli/app.rs
- crates/iii-worker/src/cli/managed.rs
- crates/iii-worker/src/cli/registry.rs
- crates/iii-worker/src/lib.rs
- crates/iii-worker/tests/worker_integration.rs
- docs/how-to/reproduce-worker-installs.mdx
- docs/workers/managed-worker-lockfile.mdx
✅ Files skipped from review due to trivial changes (2)
- docs/how-to/reproduce-worker-installs.mdx
- docs/workers/managed-worker-lockfile.mdx
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/iii-worker/src/cli/app.rs
```rust
let names: Vec<String> = match worker_name {
    Some(name) => {
        if !lockfile.workers.contains_key(name) {
            eprintln!("{} Worker '{}' is not in iii.lock", "error:".red(), name);
            return 1;
        }
        vec![name.to_string()]
    }
    None => locked_root_worker_names(&lockfile),
};

let mut fail_count = 0;
for name in &names {
    let graph = match fetch_resolved_worker_graph(
        name,
        Some("latest"),
        Some(binary_download::current_target()),
    )
    .await
    {
        Ok(graph) => graph,
        Err(e) => {
            eprintln!("{} {}", "error:".red(), e);
            fail_count += 1;
            continue;
        }
    };

    let rc = handle_resolved_graph_add(&graph, false).await;
    if rc != 0 {
        fail_count += 1;
    }
}

if fail_count == 0 { 0 } else { 1 }
```
Minor: update with no name and an empty lockfile exits 0 silently.
If iii.lock parses but contains zero entries, locked_root_worker_names returns an empty vec, the loop runs zero iterations, and the command reports success with no output. This is very easy to mistake for "updated successfully" in CI logs. Consider emitting a message like "no workers pinned in iii.lock; nothing to update" (or a non-zero exit) to distinguish from the happy path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/iii-worker/src/cli/managed.rs` around lines 279 - 314, After computing
names (from worker_name or locked_root_worker_names), add a check for
names.is_empty() and if true print a clear stderr message like "no workers
pinned in iii.lock; nothing to update" (using eprintln! and the same red prefix
style as other errors) and return a non-zero exit code (e.g., 1) so the command
does not silently succeed; place this check before the for name in &names loop
(the surrounding function that uses fetch_resolved_worker_graph,
handle_resolved_graph_add, and fail_count).
Binary lock entries record a single target triple today, so a lockfile created on macOS could previously satisfy Linux CI as long as the worker name existed. Frozen verification now checks the current target for binary workers and fails when the lock was resolved for another platform.

Constraint: iii.lock v1 still stores one binary target per entry.
Rejected: Expand the schema to multi-target binary pins | requires resolver/API certainty that every graph response includes all platform artifacts.
Confidence: high
Scope-risk: narrow
Directive: If the resolver starts returning all binary artifacts reliably, prefer evolving iii.lock to multi-target binary source pins instead of weakening this target check.
Tested: cargo fmt --all
Tested: cargo fmt --all -- --check
Tested: cargo test -p iii-worker cli::lockfile::tests -- --nocapture
Tested: cargo test -p iii-worker managed::tests -- --nocapture
Tested: docs/docs.json parsed with node
Tested: git diff --check
Not-tested: pnpm fmt:check; still blocked by existing Biome nested-root configuration errors.
Binary lockfile entries now record a sorted artifact map keyed by target triple, so one resolved worker version can carry macOS and Linux binaries in the same iii.lock. The resolver path asks for a platform-neutral graph, while verify checks only that the current target exists in the locked artifact set. Legacy single-target binary source entries are still accepted on read and rewritten into the artifacts form.

Constraint: Existing single-target iii.lock files must remain readable.
Rejected: Keep one binary target per lock entry | reproduces the macOS-to-Linux CI failure the lockfile is meant to avoid.
Confidence: high
Scope-risk: moderate
Directive: Do not reintroduce target-filtered graph resolution unless the registry has a separate immutable multi-target metadata source.
Tested: cargo fmt --all; cargo fmt --all -- --check; cargo test -p iii-worker cli::lockfile::tests -- --nocapture; cargo test -p iii-worker managed::tests -- --nocapture; cargo test -p iii-worker cli::registry::tests -- --nocapture; docs/docs.json parse; git diff --check
Not-tested: pnpm fmt:check, because pnpm is not available on PATH in this shell.
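Under the artifact-map shape this describes, the per-target check in frozen verification reduces to a key lookup; the field names below are assumptions from this thread, not the crate's actual types:

```rust
use std::collections::BTreeMap;

// Stand-in for a locked binary entry's multi-target artifact map,
// keyed by target triple.
struct LockedBinary {
    artifacts: BTreeMap<String, String>, // target triple -> pinned artifact
}

/// Fail closed when no artifact was locked for the target we run on.
fn check_current_target(name: &str, entry: &LockedBinary, target: &str) -> Result<(), String> {
    if entry.artifacts.contains_key(target) {
        Ok(())
    } else {
        Err(format!(
            "worker '{name}' has no locked artifact for target '{target}'"
        ))
    }
}
```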
The lockfile is now validated before read/write callers use it, so unsupported schema versions, invalid binary checksums, unpinned images, invalid worker names, and type/source mismatches do not pass as reproducible pins. Worker verify now uses a fallible config.yaml name reader so parse/read errors cannot turn into empty-config success, and resolved registry graph nodes are prevalidated before any install or config side effects.

Constraint: Non-frozen worker sync remains intentionally read-only in this PR.
Rejected: Implement full lockfile sync installer here | larger behavior change already documented as follow-up and outside the necessary review fixes.
Confidence: high
Scope-risk: narrow
Directive: Keep verify fail-closed; do not route it through tolerant config helpers that hide read or YAML errors.
Tested: cargo fmt --all; cargo fmt --all -- --check; cargo test -p iii-worker cli::lockfile::tests -- --nocapture; cargo test -p iii-worker managed::tests -- --nocapture; cargo test -p iii-worker cli::config_file::tests -- --nocapture; git diff --check
Not-tested: pnpm fmt:check, because pnpm is not available on PATH in this shell.
Registry graph installs can mutate config.yaml before iii.lock is written. This makes the graph path validate the resolved lockfile and existing lockfile before install, snapshot config.yaml, and restore it on any later graph failure so verify is not left with configured-but-unpinned workers. Empty lockfile updates now say there is nothing to update instead of exiting silently.

Constraint: CodeRabbit flagged partial graph installs and empty lockfile updates as actionable review feedback.
Rejected: Remove only newly added worker names | unsafe when a graph update touches workers that already existed in config.yaml.
Confidence: high
Scope-risk: narrow
Directive: Keep graph-level side effects fail-closed until sync grows real lockfile-driven install semantics.
Tested: cargo fmt --all -- --check
Tested: cargo test -p iii-worker managed::tests -- --nocapture
Tested: cargo test -p iii-worker handle_resolved_graph_add_rejects_invalid_existing_lockfile_before_side_effects -- --nocapture
Tested: cargo test -p iii-worker handle_resolved_graph_add_restores_config_when_later_node_fails -- --nocapture
Tested: cargo test -p iii-worker handle_worker_update_reports_empty_lockfile_without_resolving -- --nocapture
Tested: git diff --check
Not-tested: Full cargo test -p iii-worker, due to the pre-existing local full-suite hang noted earlier on worker_list_discovery_integration.
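The snapshot-and-restore guard could look roughly like this; the type and its methods are illustrative, not the crate's actual API:

```rust
use std::io::ErrorKind;
use std::path::{Path, PathBuf};

/// Captures config.yaml before graph-level side effects so a later
/// failure can roll the file back to its pre-install state.
struct ConfigSnapshot {
    path: PathBuf,
    contents: Option<String>, // None: the file did not exist beforehand
}

impl ConfigSnapshot {
    fn take(path: &Path) -> Self {
        Self {
            path: path.to_path_buf(),
            contents: std::fs::read_to_string(path).ok(),
        }
    }

    fn restore(&self) -> std::io::Result<()> {
        match &self.contents {
            Some(contents) => std::fs::write(&self.path, contents),
            None => match std::fs::remove_file(&self.path) {
                Err(e) if e.kind() == ErrorKind::NotFound => Ok(()),
                other => other,
            },
        }
    }
}
```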
Summary

Adds reproducible registry-managed worker installs through `iii.lock`, including dependency graph locking, CI verification, update flows, legacy registry fallback, portable binary artifacts across platforms, and fail-closed lockfile validation. This makes registry shorthand installs resolver-first: when the registry returns a worker graph, the CLI installs the whole graph and records deterministic pins for every resolved worker.
What Changed
Lockfile

- New `iii.lock` YAML format with `version: 1` and sorted `workers` entries.
- Each entry pins `version`, `type`, `dependencies`, and `source`.
- This is closer to Cargo's lockfile model: the resolved worker version and dependency graph are shared across machines, while each platform selects its matching artifact.
- Older single-target binary sources using `target`, `url`, and `sha256` remain readable and are rewritten into the `artifacts` shape on update.
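An illustrative `iii.lock` in this shape; the worker names, URLs, and digests are made up, and the exact nesting under `source` is an assumption from this thread:

```yaml
version: 1
workers:
  echo:
    version: 1.2.0
    type: binary
    dependencies: []
    source:
      artifacts:
        aarch64-apple-darwin:
          url: https://registry.example.com/echo/1.2.0/aarch64-apple-darwin
          sha256: 1111111111111111111111111111111111111111111111111111111111111111
        x86_64-unknown-linux-gnu:
          url: https://registry.example.com/echo/1.2.0/x86_64-unknown-linux-gnu
          sha256: 2222222222222222222222222222222222222222222222222222222222222222
  transcode:
    version: 0.4.1
    type: image
    dependencies:
      - echo
    source:
      image: registry.example.com/transcode@sha256:3333333333333333333333333333333333333333333333333333333333333333
```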
Worker CLI

- `iii worker add <worker[@version]>` tries `/resolve` first and writes or merges lockfile entries in `iii.lock` when a graph is returned.
- Resolved graph nodes and any existing `iii.lock` are validated before install side effects.
- Graph installs snapshot `config.yaml` and restore it if a later node, lockfile merge, or lockfile write fails, avoiding configured-but-unpinned partial installs.
- `iii worker update [worker]` re-resolves `latest` for one locked worker, or inferred root workers when no worker is supplied.
- `iii worker update` with an empty lockfile now reports that there is nothing to update instead of exiting silently.
- `iii worker verify` checks lockfile-managed workers from `config.yaml`, allows extra lockfile entries, skips built-ins/direct OCI/local-path sandbox workers, and validates that locked binary workers include an artifact for the current target.
- `iii worker sync --frozen` delegates to `verify`.
- `iii worker sync` remains read-only for now: it reads/parses `iii.lock` and reports the pinned worker count.
- `/resolve` compatibility failures such as 404/405 fall back to the legacy `/download` endpoint.
- Help text for `sync` now matches the current read-only implementation.
Docs

- New how-to and reference pages documenting reproducible worker installs with `iii.lock`.

Validation
- `cargo fmt --all`
- `cargo fmt --all -- --check`
- `cargo test -p iii-worker cli::lockfile::tests -- --nocapture`
- `cargo test -p iii-worker cli::config_file::tests -- --nocapture`
- `cargo test -p iii-worker managed::tests -- --nocapture`
- `cargo test -p iii-worker cli::registry::tests -- --nocapture`
- `cargo test -p iii-worker sync_help_matches_current_read_only_behavior -- --nocapture`
- `cargo test -p iii-worker handle_resolved_graph_add_rejects_invalid_existing_lockfile_before_side_effects -- --nocapture`
- `cargo test -p iii-worker handle_resolved_graph_add_restores_config_when_later_node_fails -- --nocapture`
- `cargo test -p iii-worker handle_worker_update_reports_empty_lockfile_without_resolving -- --nocapture`
- `docs/docs.json` parse
- `git diff --check`

Notes / Known Gaps
- `pnpm fmt:check` was not run in this shell because `pnpm` was not available on `PATH`. Earlier local attempts also exposed existing Biome nested-root configuration issues unrelated to this PR.
- Full `cargo test -p iii-worker` was not used as the final verification path because an existing worker-list discovery integration test previously hung locally on a copied `/bin/sleep` sentinel. The affected lockfile, registry, managed-worker, config-file, and CLI-help suites were run directly.
- `worker sync` intentionally remains a read/parse/report command in this PR; it does not install artifacts from the lockfile yet.