Skip to content

[executor] Add .mirrored() to ExecutorStep and InputName#4224

Open
claude[bot] wants to merge 8 commits intomainfrom
agent/20260328-fix-4223
Open

[executor] Add .mirrored() to ExecutorStep and InputName#4224
claude[bot] wants to merge 8 commits intomainfrom
agent/20260328-fix-4223

Conversation

@claude
Copy link
Copy Markdown
Contributor

@claude claude bot commented Mar 28, 2026

Add mirror_budget_gb field to InputName and .mirrored(budget_gb) methods to both ExecutorStep and InputName, so callers can write default_eval(step=training_run.mirrored()) instead of manually resolving paths and prepending mirror://. The budget propagates through cd() and nonblocking() chains, does not affect version hashing (consistent with MirroredValue), and is picked up by _max_mirror_budget. Adds seven tests covering the new methods, chaining, instantiation, and version stability.

Fixes #4223

Add mirror_budget_gb field to InputName and .mirrored(budget_gb) methods
to both ExecutorStep and InputName, so callers can request mirror://
resolution without manually wrapping in MirroredValue. The budget is
preserved through cd() and nonblocking() chains, does not affect version
hashing, and is picked up by _max_mirror_budget.

Fixes #4223

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f254bcec72

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +1175 to +1176
if obj.mirror_budget_gb is not None and not resolved.startswith("mirror://"):
return f"mirror://{resolved}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Normalize mirrored InputName paths before prefixing protocol

When InputName.mirror_budget_gb is set, this wraps the fully resolved dependency path (e.g. gs://.../train-hash) as mirror://{resolved}. MirrorFileSystem treats the mirror path as a relative key and prepends regional prefixes (_local_url/_remote_url in lib/iris/src/iris/marin_fs.py), so passing a full URL produces malformed lookups like gs://local/gs://... and the file is reported missing. This means the new ExecutorStep.mirrored()/InputName.mirrored() flow fails for normal absolute executor output paths.

Useful? React with 👍 / 👎.

@Helw150
Copy link
Copy Markdown
Member

Helw150 commented Mar 28, 2026

Hmmm, @claude upon closer inspection I made a bad design decision here. I think we just want to make it so that .mirrored works similar to as_input_name. Call it as_mirrored_value even.

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 28, 2026

Claude finished @Helw150's task in 6m 32s —— View job


Redesigned .mirrored().as_mirrored_value() to reuse MirroredValue wrapper

  • Read current implementation and understand as_input_name pattern
  • Redesign: remove mirror_budget_gb from InputName, add as_mirrored_value()
  • Update tests
  • Run pre-commit and tests
  • Commit and push

What changed:

Instead of adding mirror_budget_gb to InputName (mixing concerns), as_mirrored_value() now returns MirroredValue(value=InputName(...)) — reusing the existing MirroredValue wrapper pattern, just like as_input_name() wraps a step in InputName.

This works because collect_dependencies_and_version already unwraps MirroredValue transparently (line 1068), and instantiate_config already recurses into MirroredValue.value and adds the mirror:// prefix. No new special-casing needed.

New API:

# Step root
step.as_mirrored_value(budget_gb=25)

# With subdirectory (build path first, mirror last)
(step.as_input_name() / "hf" / "checkpoint").as_mirrored_value(budget_gb=25)

# Using free function
mirrored(step.as_input_name() / "hf", budget_gb=25)

This also addresses the Codex review concern about malformed mirror:// paths — MirroredValue already handles the prefix correctly via instantiate_config.

Testing: All 25 tests pass (1 skipped for unrelated reasons), pre-commit clean. Tests cover: ExecutorStep.as_mirrored_value(), InputName.as_mirrored_value(), cd-then-wrap chaining, the free mirrored() function with InputName, instantiate_config resolution, subdir resolution, and version hash stability.
·

…apper

Instead of adding mirror_budget_gb to InputName (mixing concerns),
as_mirrored_value() returns MirroredValue(value=InputName(...)) which
reuses the existing MirroredValue handling in collect_dependencies_and_version
and instantiate_config. This also fixes the codex review concern about
malformed mirror:// paths since MirroredValue already handles the prefix
correctly.

Co-authored-by: William Held <Helw150@users.noreply.github.com>
@Helw150
Copy link
Copy Markdown
Member

Helw150 commented Mar 28, 2026

@claude can you specifically update ref_hparam_sweep to use this mirrored value so it would be distributed in Iris as an example?

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 28, 2026

Claude finished @Helw150's task in 4m 53s —— View job


Updated reference_hyperparameter_sweep to use as_mirrored_value()

  • Find and read ref_hparam_sweep experiment file
  • Read current as_mirrored_value() API on this branch
  • Update sweep to use mirrored values for Iris distribution
  • Run pre-commit and tests
  • Commit and push

Changes made to experiments/references/reference_hyperparameter_sweep.py:

  1. Imported MirroredValue from the executor module
  2. Updated VizierUpdateConfig.run_paths type to list[str | MirroredValue]
  3. Changed _build_update_step to use step.as_mirrored_value() instead of step.as_input_name() for training step outputs

This means when the update step reads training run metrics cross-region, the paths will be mirrored via mirror:// — Iris will handle caching the checkpoint data locally before the update step reads it.

# Before
run_paths=[step.as_input_name() for step in training_steps],

# After
run_paths=[step.as_mirrored_value() for step in training_steps],

Testing: Pre-commit clean, all 25 executor tests pass (1 skipped for unrelated reasons).

…stribution

Update the reference sweep to mirror training step outputs when reading them
cross-region in the update step, demonstrating the new as_mirrored_value() API.

Co-authored-by: William Held <Helw150@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Helw150
Copy link
Copy Markdown
Member

Helw150 commented Mar 28, 2026

@claude You'll need to mirror the outputs of the update step as well?

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 28, 2026

Claude finished @Helw150's task in 4m 22s —— View job


  • Read current state of reference_hyperparameter_sweep.py and understand update step outputs
  • Read executor API to understand how to mirror step outputs
  • Update the update step outputs to be mirrored when consumed downstream
  • Run pre-commit and tests
  • Commit and push

Mirrored all cross-step DB paths for Iris distribution

Updated all Vizier DB path references to use as_mirrored_value():

  1. Suggest step DB → update step: (suggest_step / VIZIER_DB_FILENAME).as_mirrored_value()
  2. Update step DB → next loop's suggest step: (previous_update_step / VIZIER_DB_FILENAME).as_mirrored_value()
  3. Update step DB → optimal step: (previous_update_step / VIZIER_DB_FILENAME).as_mirrored_value()

Also updated all input_db_path type annotations across config dataclasses and builder functions to accept str | MirroredValue | None.

Testing: Pre-commit clean, 25 executor tests pass (1 skipped for unrelated reasons).

View job

…ibution

Wrap all Vizier DB path references (suggest step DB → update step,
update step DB → next suggest step, update step DB → optimal step)
with as_mirrored_value() so Iris caches them locally before reads.

Co-authored-by: William Held <Helw150@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Helw150
Copy link
Copy Markdown
Member

Helw150 commented Mar 28, 2026

@claude Can you make MirroredValue support truediv and cd()? That way it serves as a more drop in replacement for InputNames and ExecutorSteps when referring to files?

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 28, 2026

Claude finished @Helw150's task in 3m 25s —— View job


Add __truediv__ and cd() to MirroredValue

  • Read current MirroredValue, InputName, and ExecutorStep implementations
  • Implement __truediv__ and cd() on MirroredValue
  • Add tests for new methods
  • Run pre-commit and tests
  • Commit and push

Added cd() and __truediv__ (i.e. /) to MirroredValue so it works as a drop-in replacement for InputName/ExecutorStep when navigating paths:

# Now works — navigates into subdirs while preserving the mirror wrapper
step.as_mirrored_value(budget_gb=25) / "hf" / "checkpoint"
step.as_mirrored_value().cd("hf").cd("checkpoint")

# Also works with string-wrapped MirroredValues
MirroredValue(value="gs://bucket/path", budget_gb=10) / "subdir"

The budget_gb is preserved through chaining. Supports inner types ExecutorStep, InputName, and str.

Testing: Pre-commit clean, 29 tests pass (1 skipped for unrelated reasons). Added 4 new tests covering: __truediv__ chaining, cd() chaining, instantiate_config resolution with / subdirs, and string value navigation.

View job

MirroredValue now supports / and cd() so callers can navigate into
subdirectories without unwrapping and re-wrapping:
  step.as_mirrored_value() / "hf" / "checkpoint"

The budget_gb is preserved through chaining. Works with inner types
ExecutorStep, InputName, and str.

Co-authored-by: William Held <Helw150@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Helw150
Copy link
Copy Markdown
Member

Helw150 commented Mar 28, 2026

@claude I think you lost context here. Can you update the example (ref_hparam_sweep) based on the new changes to support truediv

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 28, 2026

Claude finished @Helw150's task in 3m 15s —— View job


Update ref_hparam_sweep to use MirroredValue.__truediv__

  • Read current state of reference_hyperparameter_sweep.py
  • Read MirroredValue implementation for __truediv__/cd()
  • Update sweep to use / operator on mirrored values
  • Run pre-commit and tests
  • Commit and push

Updated all three mirrored path expressions in reference_hyperparameter_sweep.py to use MirroredValue.__truediv__ instead of wrapping after navigation:

# Before (navigate then mirror)
(previous_update_step / VIZIER_DB_FILENAME).as_mirrored_value()
(suggest_step / VIZIER_DB_FILENAME).as_mirrored_value()

# After (mirror then navigate — uses MirroredValue.__truediv__)
previous_update_step.as_mirrored_value() / VIZIER_DB_FILENAME
suggest_step.as_mirrored_value() / VIZIER_DB_FILENAME

Testing: Pre-commit clean, 29 executor tests pass (1 skipped for unrelated reasons).

View job

Simplify mirrored path expressions by applying .as_mirrored_value() first
then using / to navigate into subdirectories, demonstrating the new
MirroredValue.__truediv__ support as a drop-in replacement.

Co-authored-by: William Held <Helw150@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Helw150
Copy link
Copy Markdown
Member

Helw150 commented Mar 28, 2026

@claude

  • We only need to keep test_mirrored_input_name_instantiate_config, test_mirrored_value_truediv_instantiate, test_mirrored_input_name_does_not_affect_version. The rests are a bit silly.
  • The hparam sweep doesn't need all those | on the typing. The file is self contained so doesn't need to maintain backward compatibility.

@claude
Copy link
Copy Markdown
Contributor Author

claude bot commented Mar 28, 2026

Claude finished @Helw150's task in 4m 17s —— View job


Cleaned up tests and types

  • Read current test file and identify tests to remove
  • Read hparam sweep file and identify unnecessary union types
  • Remove excess tests (keep only 3 specified)
  • Clean up | MirroredValue union types in hparam sweep
  • Run pre-commit and tests
  • Commit and push

Changes:

  1. Tests — Removed 10 redundant mirrored tests, keeping only the 3 requested:

    • test_mirrored_input_name_instantiate_config
    • test_mirrored_value_truediv_instantiate
    • test_mirrored_input_name_does_not_affect_version
  2. Hparam sweep — Removed all | MirroredValue from type annotations (VizierSuggestConfig, VizierUpdateConfig, VizierOptimalConfig, and builder function signatures) and the now-unused MirroredValue import.

Testing: Pre-commit clean, 17 tests pass (1 skipped for unrelated reasons).

View job

Remove 10 redundant mirrored tests, keeping only the 3 meaningful ones:
instantiate_config, truediv_instantiate, and version stability.
Remove unnecessary | MirroredValue union types from hparam sweep
since the file is self-contained and doesn't need backward compat.

Co-authored-by: William Held <Helw150@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Helw150 Helw150 requested a review from rjpower March 28, 2026 07:47
Copy link
Copy Markdown
Collaborator

@rjpower rjpower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should be a bit less magical in our handling of these things like path-likes, but this is consistent with the existing work so I won't complain.

(I really feel outputs should be value with types on them instead of raw paths, but... that's for another time)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-generated Created by automation/agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add .mirrored() to ExecutorStep and InputName for cross-region reads

2 participants