Commit 003de39
Remove dead Ray helpers from marin.cluster (#5131)
## Summary
Follow-up cleanup to the Ray-removal effort (parent #4453). Flagged by
#5089's agent during operator-tooling review.
- Delete `lib/marin/src/marin/cluster/config.py` entirely. Its only live
caller was `ray_run.py`, removed in #5087.
- Trim Ray-only helpers from `lib/marin/src/marin/cluster/gcp.py`:
`terminate_tpus_in_cluster`, `terminate_head_node`, `delete_tpu_node`.
- Keep the generic gcloud helpers (`run_gcloud_command`,
`get_project_id`, `get_default_zone`, `list_instances`,
`list_tpu_nodes`, `find_tpu_by_ip`, `find_vm_by_ip`, `ssh_to_vm`,
`ssh_to_tpu`) — used by `scripts/iris/dev_tpu.py` (backs the `dev-tpu`
skill) and `scripts/gcp-ssh`.
- Clean stale `RayClusterConfig` / `update_cluster_configs` exports from
`lib/marin/src/marin/cluster/__init__.py`.
Total: 535 LOC removed, no new code.
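A stale-export check like the one behind the `__init__.py` cleanup can be sketched in a few lines. This is only an illustration: it assumes the package declares its public names through `__all__`, which the description above does not confirm, and both `stale_exports` and the `demo_cluster` stand-in module are hypothetical.

```python
import types


def stale_exports(module: types.ModuleType) -> list[str]:
    """Return names listed in ``__all__`` that the module no longer defines."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Hypothetical stand-in for a package whose __init__ still lists removed names.
demo = types.ModuleType("demo_cluster")
demo.gcp = object()  # the submodule that is kept
demo.__all__ = ["gcp", "RayClusterConfig", "update_cluster_configs"]

print(stale_exports(demo))  # ['RayClusterConfig', 'update_cluster_configs']
```

An empty list from a check of this kind would indicate that every advertised export still resolves.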
## Audit
Performed before deletion on `origin/main` (HEAD `e10e14055`).
| Symbol | External callers | Action |
|---|---|---|
| `marin.cluster.config` module | 0 (grep `marin\.cluster\.config\|from marin\.cluster import config\|cluster\.config\.`) | delete |
| `RayClusterConfig` | 0 outside `cluster/__init__.py` and `cluster/config.py` | stale export, remove |
| `update_cluster_configs` | 0 outside `cluster/__init__.py` and `cluster/config.py` | stale export, remove |
| `terminate_tpus_in_cluster` | 0 outside `gcp.py` itself | delete |
| `terminate_head_node` | 0 outside `gcp.py` itself | delete |
| `delete_tpu_node` | 1, and it's `terminate_tpus_in_cluster` inside `gcp.py` (line 229) | delete |
| `from marin.cluster import gcp` | 2: `scripts/iris/dev_tpu.py`, `scripts/gcp-ssh` | keep module |
Post-deletion grep: zero hits for all deleted names in the tree, and the
two `marin.cluster.gcp` importers are intact.
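The "external callers" column of the audit can be approximated with a small script. The sketch below is not the tooling used for this PR (that was grep); `external_hits` is a hypothetical helper and the file contents in the demo are invented. It scans every file under a root, skips the files that define the symbols, and reports where each name is still mentioned.

```python
import re
import tempfile
from pathlib import Path


def external_hits(root: Path, symbols: list[str], defining_files: set[str]) -> dict[str, list[str]]:
    """Map each symbol to the files (relative to root) that mention it,
    ignoring the files listed in ``defining_files``."""
    patterns = {s: re.compile(rf"\b{re.escape(s)}\b") for s in symbols}
    hits: dict[str, list[str]] = {s: [] for s in symbols}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if rel in defining_files:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for symbol, pattern in patterns.items():
            if pattern.search(text):
                hits[symbol].append(rel)
    return hits


# Toy tree with invented contents, standing in for the real repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "gcp.py").write_text("def delete_tpu_node():\n    pass\n")
    (root / "dev_tpu.py").write_text("from gcp import ssh_to_tpu\n")
    result = external_hits(root, ["delete_tpu_node", "ssh_to_tpu"], {"gcp.py"})

print(result)  # {'delete_tpu_node': [], 'ssh_to_tpu': ['dev_tpu.py']}
```

A symbol whose list is empty has no mentions outside its defining files, which is the condition the table records as "0".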
### Notes on unused-but-kept helpers
These generic helpers have zero external callers right now. They are
retained per review instructions — the user will decide whether to prune
them:
- `get_default_zone`: not called anywhere in marin (levanter has its own
copy at `lib/levanter/src/levanter/infra/cli_helpers.py:78`).
- `list_instances`: its only internal caller was `terminate_head_node`
(now deleted). No external callers.
All other "keep" helpers have live callers either externally (scripts)
or internally (`run_gcloud_command`, `list_tpu_nodes` used by
`find_tpu_by_ip`).
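Internal caller relationships, such as `list_tpu_nodes` being used by `find_tpu_by_ip`, can be checked syntactically rather than by text search. The following is a minimal sketch using the standard `ast` module. `internal_callers` is a hypothetical helper, the function bodies in `SAMPLE` are invented for illustration and are not the real `gcp.py`, and only bare-name calls are detected (not `module.func()` or aliased calls).

```python
import ast


def internal_callers(source: str, target: str) -> list[str]:
    """Names of top-level functions in ``source`` that call ``target`` by bare name."""
    callers = []
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for inner in ast.walk(node):
            if (
                isinstance(inner, ast.Call)
                and isinstance(inner.func, ast.Name)
                and inner.func.id == target
            ):
                callers.append(node.name)
                break  # one hit per function is enough
    return callers


# Invented module text; only the function names come from the description above.
SAMPLE = '''
def list_tpu_nodes():
    return []

def find_tpu_by_ip(ip):
    return [n for n in list_tpu_nodes() if n == ip]

def list_instances():
    return []
'''

print(internal_callers(SAMPLE, "list_tpu_nodes"))  # ['find_tpu_by_ip']
print(internal_callers(SAMPLE, "list_instances"))  # []
```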
## Test plan
- [x] `rg 'marin\.cluster\.config|from marin\.cluster import config|cluster\.config\.'` -> 0 hits
- [x] `rg 'terminate_tpus_in_cluster|terminate_head_node|delete_tpu_node'` -> 0 hits
- [x] `rg 'from marin\.cluster import gcp|marin\.cluster\.gcp'` -> 2 hits, both in `scripts/`
- [x] `rg 'RayClusterConfig|update_cluster_configs'` -> 0 hits
- [x] `uv run scripts/iris/dev_tpu.py --help` runs cleanly (lists
allocate/connect/execute/release/setup_env/status/watch)
- [x] `uv run python -c 'import marin.cluster; import
marin.cluster.gcp'` -> imports ok
- [x] `./infra/pre-commit.py --all-files --fix` passes (ruff, black,
license headers, pyrefly, AST, merges, TOML/YAML, trailing whitespace,
EOF newlines, notebooks, markdown)
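The import smoke test in the plan can also be expressed as a reusable check. This is a sketch under stated assumptions: `failed_imports` is a hypothetical helper, and the demo uses a standard-library module plus a deliberately missing name so it runs anywhere. In the repository one would presumably pass `marin.cluster` and `marin.cluster.gcp` instead.

```python
import importlib


def failed_imports(module_names: list[str]) -> dict[str, str]:
    """Try to import each module; return a mapping of failures to error text."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report any import-time error, not only missing modules
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in module names so the example is runnable outside the repository.
failures = failed_imports(["json", "no_such_module_for_demo"])
print(sorted(failures))  # ['no_such_module_for_demo']
```

An empty result corresponds to the "imports ok" outcome recorded in the checklist.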
---------
Co-authored-by: Romain Yon <1596570+yonromai@users.noreply.github.com>

1 parent 4693e5e commit 003de39
3 files changed
Lines changed: 0 additions & 566 deletions