[autoscaler] Clarify head node commands in ray up output #63409
Code Review
This pull request clarifies the context of CLI commands in Ray by adding warnings and notes to distinguish between head node and local machine operations. It also introduces a test suite that validates these changes by inspecting source code strings. Feedback suggests using standard ASCII characters for separators and dimmed text instead of warnings for consistency across the CLI. Additionally, the testing approach was noted as fragile, with a recommendation to use more robust methods like mocking the logger to avoid maintenance burdens caused by minor source code changes.
@@ -932,7 +932,17 @@ def get_or_create_head_node(
modifiers = ""

cli_logger.newline()
with cli_logger.group("Useful commands:"):
cli_logger.print(
    cf.dimmed("─" * 60)
The use of the box-drawing character ─ (U+2500) is inconsistent with other separators in the codebase (e.g., python/ray/scripts/scripts.py:1087), which use the standard ASCII hyphen -. Using standard ASCII ensures better compatibility across various terminal emulators and maintains consistency with existing Ray CLI output.
- cf.dimmed("─" * 60)
+ cf.dimmed("-" * 60)
Updated to use ASCII "-" separators for consistency with existing Ray CLI output.
cli_logger.warning(
    "The output above is from the head node (via `ray start`).\n"
    "  Commands shown in 'Next steps' may only work from the "
    "head node\n"
    "  or from within the cluster network."
)
The use of cli_logger.warning for this contextual note is inconsistent with the implementation in python/ray/scripts/scripts.py (line 1092), which uses a dimmed print. A warning typically implies a potential error or an action required by the user, whereas this is an informational note about the execution context. Additionally, warning adds a [WARN] prefix in non-TTY environments, which might be misleading here. Consider using cli_logger.print(cf.dimmed(...)) for consistency with the head-node side of the output.
Changed this from cli_logger.warning to cli_logger.print(cf.dimmed(...)) since this is an informational context note, not a warning condition.
"""Tests for CLI output context disambiguation (ray start / ray up).

These tests validate the source code directly (via file reading) to avoid
requiring a full Ray build with native C++ extensions.

Validates that:
1. `ray start --head` "Next steps" includes a note about head node / cluster network context.
2. `ray up` "Useful commands" section is labeled for the local machine.
3. `ray up` includes a warning that head node output may not work from local.
4. `ray start --head` direct usage still works and output is valid.
"""
Testing CLI output by performing string searches on the source code is highly fragile and creates a maintenance burden. These tests are tightly coupled to implementation details such as variable names, formatting, and specific line contents (e.g., lines 83-87). Any refactoring or minor wording changes in the source will break these tests even if the functional output remains correct. While the desire to avoid native builds is understood, consider using a more robust approach, such as mocking the cli_logger to verify its calls, or at least using regex patterns that are less sensitive to the exact source code structure.
Updated the tests to avoid source-code inspection. The output text is now produced by standalone helper functions, and the tests import those helpers directly with mocked cli_logger/cf objects to assert on emitted output behavior.
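The approach described in this reply can be sketched as follows. This is a minimal illustration rather than the PR's actual code: `print_head_node_context_note` and `capture_context_note` are hypothetical names, and the note wording is copied from the diff quoted earlier in this review. The `cli_logger` and `cf` objects are `unittest.mock.MagicMock` stand-ins, so nothing from Ray needs to be importable.

```python
from unittest import mock

SEPARATOR = "-" * 60  # ASCII separator, as agreed in the review


def print_head_node_context_note(cli_logger, cf):
    """Hypothetical helper: emit a dimmed separator and context note.

    Each line gets its own print call so a logger can prefix every line.
    """
    cli_logger.print(cf.dimmed(SEPARATOR))
    cli_logger.print(
        cf.dimmed("The output above is from the head node (via `ray start`).")
    )
    cli_logger.print(
        cf.dimmed(
            "Commands shown in 'Next steps' may only work from the head node "
            "or from within the cluster network."
        )
    )


def capture_context_note():
    """Run the helper against mocks and return (printed lines, logger mock)."""
    cli_logger = mock.MagicMock()
    cf = mock.MagicMock()
    # Make dimmed() an identity function so assertions see the raw text.
    cf.dimmed.side_effect = lambda text: text
    print_head_node_context_note(cli_logger, cf)
    printed = [call.args[0] for call in cli_logger.print.call_args_list]
    return printed, cli_logger
```

Asserting on the recorded calls, instead of on source text, keeps the test stable when the helper is refactored, and `cli_logger.warning.assert_not_called()` can confirm that no `[WARN]`-style call is made.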
Hi Ray maintainers, It looks like buildkite/microcheck failed in tests that seem unrelated to this PR's changed files, such as:
From the log, the failure appears to involve cleanup/runtime behavior, e.g. OSError: [Errno 39] Directory not empty: '/tmp/ray', while this PR only changes CLI output text/helpers and the related CLI output context tests. Could you please advise whether this should be retried or if there is anything I should adjust on my side? Thank you!
When running 'ray up', the terminal mixes commands from two different contexts: head node commands (from 'ray start --head') and local machine commands (from 'ray up' itself). This causes confusion because users may copy-paste head node commands (with private IPs) on their local machine.

Changes:
- Extract output helpers into cli_output_helpers.py (no ray C++ deps)
- Add dimmed note in 'Next steps' (ray start) clarifying commands are for the head node or cluster network
- Add dimmed separator and context note in 'ray up' output between head node output and local commands (no [WARN] prefix)
- Split context notes into separate print calls to ensure record mode appends the correct line prefix
- Gate context note on ray_start_commands so it isn't printed for --no-restart
- Rename 'Useful commands:' to 'Useful commands for your local machine:'
- Use ASCII '-' separator for terminal compatibility
- Add test_cli_output_context.py with 11 tests that call helpers directly and assert on captured cli_logger output (and AST checks for gating)
- Update test_cli_patterns to match the new output strings

Signed-off-by: cyhapun <cyhapun242@gmail.com>
Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
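The gating on `ray_start_commands` mentioned in this commit message could look roughly like the sketch below. It is an assumption-laden illustration: `RecordingLogger` and `maybe_print_head_node_note` are hypothetical names, the logger is a plain stand-in for `cli_logger`, and dimming is omitted for brevity.

```python
from typing import List, Sequence

# One entry per print call, so each line can receive its own prefix.
NOTE_LINES = [
    "-" * 60,
    "The output above is from the head node (via `ray start`).",
    "Commands shown in 'Next steps' may only work from the head node "
    "or from within the cluster network.",
]


class RecordingLogger:
    """Hypothetical stand-in for cli_logger that records printed lines."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def print(self, message: str) -> None:
        self.lines.append(message)


def maybe_print_head_node_note(
    logger: RecordingLogger, ray_start_commands: Sequence[str]
) -> None:
    """Print the context note only if `ray start` output was actually emitted."""
    if not ray_start_commands:
        # e.g. a --no-restart flow: no head node output to explain.
        return
    for line in NOTE_LINES:
        logger.print(line)
```

With an empty command list the function prints nothing, which matches the `--no-restart` case described above.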
Add a pass block to test.rules.txt so changes touching doc/*.rst files no longer trigger premerge tests, and drop the corresponding doc/tutorial.rst test rule entry. This unblocks ongoing RST content rework, and can be removed once that effort concludes or if regressions surface.

Topic: adjust-test-rules
Signed-off-by: andrew <andrew@anyscale.com>
Signed-off-by: andrew <andrew@anyscale.com>
Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
…ndent max_num_seqs (ray-project#62918)

Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
… tasks (ray-project#63291)

## Why are these changes needed?

Python's asyncio documentation explicitly warns that the event loop only holds **weak** references to tasks created via `asyncio.create_task` / `loop.create_task`. A task without any other strong reference can be garbage collected at **any** time, even before it finishes:

> Important: Save a reference to the result of this function, to avoid a task disappearing mid-execution. The event loop only keeps weak references to tasks. A task that isn't referenced elsewhere may be garbage collected at any time, even before it's done.
>
> Source: https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task

Three call sites in the repository were violating this contract (caught by ruff's `RUF006` rule). All three create background tasks whose return value is discarded, so the resulting `Task` object is only referenced by the event loop's weak set and may be GC'd at any moment:

- **`python/ray/_private/async_utils.py`**: `enable_monitor_loop_lag()`
  The event-loop lag monitor task could be silently collected, stopping lag monitoring without warning.
- **`python/ray/train/v2/_internal/execution/checkpoint/checkpoint_manager.py`**: `CheckpointManager._notify()`
  The notify task wakes up coroutines waiting on `self._condition`. If it's collected before `notify_all()` runs, listeners may wait forever.
- **`python/ray/data/_internal/planner/plan_udf_map_op.py`**: `_generate_transform_fn_for_async_map().._execute_transform()`
  The `_reorder` background task is responsible for forwarding completed results from `completed_tasks_queue` into the output queue in deterministic order. Losing this task to GC would silently break the entire async map operation.

## What was changed

Each of the three sites now retains a strong reference to the created task using the pattern recommended by the asyncio docs:

| File | Pattern |
|---|---|
| `async_utils.py` | Module-level `_BACKGROUND_TASKS: Set[asyncio.Task]` + `task.add_done_callback(_BACKGROUND_TASKS.discard)` |
| `checkpoint_manager.py` | Per-instance `self._background_tasks: set` + `add_done_callback(self._background_tasks.discard)` |
| `plan_udf_map_op.py` | Local variable `reorder_task = asyncio.create_task(_reorder())`, awaited in the `finally` block of `_execute_transform` (also propagates unexpected exceptions) |

No public API change. No behavior change in the happy path; only correctness under GC pressure.

## Related issues

N/A (silences the `RUF006` lint warnings for these three files).

## Checks

- [x] I've signed off every commit (DCO).
- [x] `ruff check --select RUF006` passes on the three modified files.
- [x] `python -m py_compile` passes for all three files.
- [ ] I've made sure the tests are passing.

---------

Signed-off-by: forwardxu <forwardxu@apache.org>
Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
…oject#62948)

## Description

The OOM kill message re-computed the memory threshold at kill time via `GetMemoryThreshold()`, which under `--enable-resource-isolation` could read a different cgroup `memory.max` value than at init time (e.g. "max" instead of digits), silently falling back to 0.95.

Fix: have each monitor pass the threshold it actually fired on through the callback instead of letting NodeManager guess.

The OOM message should now look something like this:

```
Memory on the node (IP: 10.0.0.1, ID: abc123) was 7.20GB / 8.00GB (0.900000); OOM kill reason: Memory usage 7728742400B exceeded threshold of 6871947674B (85.9% of 8589934592B total); Object store memory usage: [...]; Ray killed 1 worker(s) based on the killing policy: [...];
```

## Related issues

> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234".

## Additional information

> Optional: Add implementation details, API changes, usage examples, screenshots, etc.

---------

Signed-off-by: You-Cheng Lin <mses010108@gmail.com>
Signed-off-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com>
Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
…ay-project#63322)

## Description

Closes ray-project#45302.

Ray 2.50.x fails to register any GPU resources on hosts with NVIDIA Blackwell-class consumer GPUs (e.g. RTX 5090, driver 570.x):

```
TimeoutError: Placement group creation timed out. Make sure your cluster has enough resources. Error: No available node types can fulfill resource request {'GPU': 1.0}
```

Two independent bugs interact:

**1. TPU false positive on `/dev/accel*`.**

NVIDIA driver 570.x (Blackwell) creates `/dev/accel/accel0` on the host. `TPUAcceleratorManager.get_current_node_num_accelerators` uses `glob.glob("/dev/accel*")` to detect TPU chips and reports `TPU == 1`, which then steals the resource slot from the NVIDIA detector and `GPU` is never registered.

Evidence chain on an RTX 5090 host (driver 570.211.01):

| Layer | Sees GPU? | Detail |
|-------|-----------|--------|
| `nvidia-smi` | ✅ | `NVIDIA GeForce RTX 5090, 32607 MiB, 570.211.01` |
| `torch.cuda` | ✅ | `device_count()=1, is_available()=True` |
| Ray `NvidiaGPUAcceleratorManager` (pynvml) | ✅ | `get_current_node_num_accelerators()=1` |
| Ray `TPUAcceleratorManager` | **false-positive 1** | `glob("/dev/accel*")` matches NVIDIA device file |
| `ray.cluster_resources()` | **no GPU** | `{'TPU': 1.0, 'CPU': 24.0, ...}` |

Fix: only count `/dev/accel*` as TPU chips when `TPU_ACCELERATOR_TYPE` is set in the environment. Real TPU VMs (GCE / GKE) always set this env var (the constant is already defined as `GKE_TPU_ACCELERATOR_TYPE_ENV_VAR` at the top of `tpu.py`). The `/dev/vfio/*` fallback for non-GKE TPU hosts is preserved.

**2. NVIDIA GPU name regex captures only `"G"` on consumer cards.**

`NVIDIA_GPU_NAME_PATTERN = re.compile(r"\w+\s+([A-Z0-9]+)")` was designed for datacenter cards (`"Tesla V100-SXM2-16GB"` → `"V100"`, `"NVIDIA A100-SXM4-40GB"` → `"A100"`). On a consumer card name like `"NVIDIA GeForce RTX 5090"` the regex stops at the lowercase `e` in `GeForce` and captures just `"G"`, producing a useless `accelerator_type:G` label.

Fix: when the existing regex returns a result of length ≤1, fall back to a hyphen-joined product name. `"NVIDIA GeForce RTX 5090"` → `"GeForce-RTX-5090"`. The original `TODO(Alex)` comment noted this exact concern; this PR addresses it without regressing the Tesla/datacenter behavior.

### After the fix

```python
>>> ray.cluster_resources()
{'GPU': 1.0, 'accelerator_type:GeForce-RTX-5090': 1.0, 'CPU': 24.0, ...}
```

## Test plan

```
pytest python/ray/tests/accelerators/test_tpu.py \
  python/ray/tests/accelerators/test_nvidia_gpu.py
# 71 passed locally
```

New/updated cases:

- `test_autodetect_num_tpus_accel_ignored_without_tpu_env`: exercises the NVIDIA-Blackwell false-positive scenario.
- `test_set_tpu_visible_ids_and_bounds` now sets `TPU_ACCELERATOR_TYPE` inside its cleared env block (matches real TPU VMs).
- `test_gpu_name_to_accelerator_type` parametrized over `Tesla V100-SXM2-16GB`, `Tesla K80`, `NVIDIA A100-SXM4-40GB`, `NVIDIA H100 80GB HBM3`, `NVIDIA GeForce RTX 5090`, `NVIDIA GeForce RTX 4090`, `None`, and `""`.

## Additional information

This problem will get more common as Blackwell-class hardware (consumer RTX 5xxx, plus B200 datacenter cards which also ship with driver 570.x) reaches more users. The same patch has already been validated end-to-end in an Applied Intuition fork running ray==2.50.1; opening here so the fix benefits everyone.

Signed-off-by: Micah <micah@applied.co>
Signed-off-by: phattruong <23120318@student.hcmus.edu.vn>
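The regex fallback described in bug 2 of this commit message can be sketched as follows. The pattern is the one quoted in the message; everything else is an assumption made for illustration: the function name mirrors the test name but is not confirmed as the real helper, dropping a leading `NVIDIA` vendor token is one plausible way to get `GeForce-RTX-5090`, and returning `None` when the pattern does not match at all is a guess.

```python
import re
from typing import Optional

# Pattern quoted in the commit message (designed for datacenter card names).
NVIDIA_GPU_NAME_PATTERN = re.compile(r"\w+\s+([A-Z0-9]+)")


def gpu_name_to_accelerator_type(name: Optional[str]) -> Optional[str]:
    """Map a GPU product name to a short accelerator type label."""
    if not name:
        return None
    match = NVIDIA_GPU_NAME_PATTERN.match(name)
    if match is None:
        # Assumption: names the pattern cannot parse yield no label.
        return None
    result = match.group(1)
    if len(result) <= 1:
        # Consumer card: the capture stopped at a lowercase letter ("G" from
        # "GeForce"). Fall back to a hyphen-joined product name.
        tokens = name.split()
        if tokens[0].upper() == "NVIDIA":  # assumption: strip the vendor token
            tokens = tokens[1:]
        return "-".join(tokens)
    return result
```

Datacenter names keep their short labels because the capture group is longer than one character, so only the degenerate single-letter case takes the fallback path.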
Signed-off-by: Huynh Tan Phuoc <huynhtanphuoc164@gmail.com>
Signed-off-by: Huynh Tan Phuoc <huynhtanphuoc164@gmail.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 761a944.
Signed-off-by: Huynh Tan Phuoc <huynhtanphuoc164@gmail.com>
Hi @edoakes, could you please review this PR when you have a chance? I’ve addressed the latest CI issues:
Thank you very much!
@cyhapun could you please post some screenshots of full log output for relevant commands before & after the change?
Signed-off-by: Huynh Tan Phuoc <huynhtanphuoc164@gmail.com>
Thanks for asking. I tried to capture live command output locally, but my current WSL environment does not have the To still make the output change reviewable, I’m including the before/after diffs from the CLI golden pattern files used by CI:
These pattern files are what the existing CLI tests match against, so they reflect the expected full CLI output changes covered by CI. Note that some characters are escaped because these files are regex patterns, not raw terminal output.
edoakes left a comment
CLI changes LGTM but I think there are some accidental inclusions here
import pyarrow as pa
import pyarrow.dataset as pds
import pyarrow.fs as pafs
import pyarrow.parquet as pq
import pytest
Hm this looks irrelevant -- was there a bad merge?
@@ -0,0 +1,13 @@
cloud: {{env["ANYSCALE_CLOUD_NAME"]}}
Signed-off-by: Huynh Tan Phuoc <huynhtanphuoc164@gmail.com>
@edoakes Thank you for pointing this out, and I apologize for the confusion. These unrelated files were added by mistake when @TruongQuangPhat resolved merge conflicts. I have removed them from the PR. The remaining changes are only for the CLI output update.
…#63409)

## Summary

This PR clarifies the CLI output shown during `ray up`.

Currently, `ray up` streams the output of `ray start --head` from the head node back to the user's local terminal. The `ray start --head` output includes "Next steps" commands that may only work from the head node or from within the cluster network, which can be confusing when viewed from the local machine running `ray up`.

This change updates the output text to make the execution context clearer:

- The `ray start --head` "Next steps" commands are identified as head-node / cluster-network commands.
- The `ray up` "Useful Commands" section is clarified as commands intended for the local machine.
- The head-node context note is only printed when `ray start` output is actually emitted, so it is not shown in `--no-restart` flows where `ray start` is skipped.
- Shared CLI output helper functions were added so the wording can be reused consistently and tested without requiring a full Ray native build.

fixes ray-project#40833

## Testing

- `ruff check` passed.
- `pytest python/ray/tests/test_cli_output_context.py` passed: 11/11.
- Tests validate emitted CLI output behavior by calling the output helpers with mocked `cli_logger` / `cf` objects.
- Added coverage for the `--no-restart` gating behavior so the head-node context note is not emitted when `ray_start_commands` is empty.
- Reviewed git diff after the review updates.

Not run:

- Manual `ray up <cluster_config.yaml>` validation, because this environment does not have live cloud provider credentials.
- Direct `ray start --head` validation, because this environment does not have a fully built Ray installation with the `_raylet` native module.

## Risk

Low risk: this is a CLI output text/formatting change only. The only control-flow change gates the informational context note so it is printed only when corresponding `ray start` output exists; it does not modify cluster launch logic.
--------- Signed-off-by: cyhapun <cyhapun242@gmail.com> Signed-off-by: phattruong <23120318@student.hcmus.edu.vn> Signed-off-by: andrew <andrew@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com> Signed-off-by: forwardxu <forwardxu@apache.org> Signed-off-by: You-Cheng Lin <mses010108@gmail.com> Signed-off-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com> Signed-off-by: Micah <micah@applied.co> Signed-off-by: Lehui Liu <lehui@anyscale.com> Signed-off-by: George Gensure <werkt@users.noreply.github.com> Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com> Signed-off-by: Xinyuan <43737116+xinyuangui2@users.noreply.github.com> Signed-off-by: xyuzh <xinyzng@gmail.com> Signed-off-by: Mira Sato <275437409+oab24413gmai@users.noreply.github.com> Signed-off-by: yicheng <yicheng@anyscale.com> Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com> Signed-off-by: Cursx <33718736+Cursx@users.noreply.github.com> Signed-off-by: Yuchen Zhou <yczhou@google.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com> Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: chenshi5012 <chenshi5012@163.com> Signed-off-by: win5923 <ken89@kimo.com> Signed-off-by: Prass, the Nomadic coder <atemysemicolon@gmail.com> Signed-off-by: Goutam <goutam@anyscale.com> Signed-off-by: goutam <goutam@anyscale.com> Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Vaishnavi Panchavati <vaishdho10@gmail.com> Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu> Signed-off-by: Clayton <claytonlin1110@gmail.com> Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: xgui <xgui@anyscale.com> Signed-off-by: ps2181 <hellopritam31@gmail.com> Signed-off-by: Ankush Babbar <ababbar@stripe.com> Signed-off-by: Ankush Babbar <ankushbbbr@gmail.com> Signed-off-by: Sriniketh J <srinikethcr7@gmail.com> Signed-off-by: sampan <sampan@anyscale.com> Signed-off-by: Sampan S Nayak <sampansnayak2@gmail.com> Signed-off-by: Huynh Tan Phuoc 
<huynhtanphuoc164@gmail.com> Signed-off-by: sai.miduthuri <sai.miduthuri@anyscale.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com> Signed-off-by: prince8273 <princesingh29757@gmail.com> Signed-off-by: lonexreb <reach2shubhankar@gmail.com> Signed-off-by: Mark Towers <mark@anyscale.com> Signed-off-by: Adam360x <Adam.pryor@amd.com> Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com> Signed-off-by: dancingactor <s990346@gmail.com> Signed-off-by: Lucas <54979663+lucas61000@users.noreply.github.com> Signed-off-by: Lucas <54979663+Lucas61000@users.noreply.github.com> Signed-off-by: john.taylor <john.taylor@anyscale.com> Signed-off-by: changyu.wang <wangchangyu315@gmail.com> Signed-off-by: Prince Kumar <princesingh29757@gmail.com> Co-authored-by: Andrew Pollack-Gray <andrew@anyscale.com> Co-authored-by: Aydin Abiar <62435714+Aydin-ab@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ForwardXu <forwardxu@apache.org> Co-authored-by: You-Cheng Lin <106612301+owenowenisme@users.noreply.github.com> Co-authored-by: Micah Yong <112517386+micah-yong-ai@users.noreply.github.com> Co-authored-by: Lehui Liu <lehui@anyscale.com> Co-authored-by: George Gensure <werkt@users.noreply.github.com> Co-authored-by: Jeffrey Wang <jeffreywang@anyscale.com> Co-authored-by: Xinyuan <43737116+xinyuangui2@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Xinyu Zhang <60529799+xyuzh@users.noreply.github.com> Co-authored-by: oab24413gmai <oab24413@gmail.com> Co-authored-by: Mira Sato <275437409+oab24413gmai@users.noreply.github.com> Co-authored-by: Yicheng-Lu-llll <51814063+Yicheng-Lu-llll@users.noreply.github.com> Co-authored-by: yicheng <yicheng@anyscale.com> Co-authored-by: Artur Niederfahrenhorst <artur@anyscale.com> Co-authored-by: Cursx <33718736+Cursx@users.noreply.github.com> Co-authored-by: Mark Towers <mark.m.towers@gmail.com> 
Co-authored-by: Yuchen Zhou <yczhou@google.com> Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: Elliot Barnwell <elliot.barnwell@anyscale.com> Co-authored-by: Justin Yu <justinvyu@anyscale.com> Co-authored-by: chenshi <chenshi5012@163.com> Co-authored-by: Jun-Hao Wan <ken89@kimo.com> Co-authored-by: Prassanna Ravishankar <atemysemicolon@gmail.com> Co-authored-by: Goutam <goutam@anyscale.com> Co-authored-by: Vaishnavi Panchavati <38342947+vaishdho1@users.noreply.github.com> Co-authored-by: Balaji Veeramani <balaji@anyscale.com> Co-authored-by: Abrar Sheikh <abrar@anyscale.com> Co-authored-by: Clayton <118192227+claytonlin1110@users.noreply.github.com> Co-authored-by: Pritam Satpathy <54544396+ps2181@users.noreply.github.com> Co-authored-by: Ankush Babbar <ankushbbbr@gmail.com> Co-authored-by: Ankush Babbar <ababbar@stripe.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Sriniketh J <81156510+srini047@users.noreply.github.com> Co-authored-by: Sampan S Nayak <sampansnayak2@gmail.com> Co-authored-by: sampan <sampan@anyscale.com> Co-authored-by: Huynh Tan Phuoc <huynhtanphuoc164@gmail.com> Co-authored-by: Sai Miduthuri <sai.miduthuri@anyscale.com> Co-authored-by: Douglas Strodtman <douglas@anyscale.com> Co-authored-by: Prince Kumar <167613824+prince8273@users.noreply.github.com> Co-authored-by: Shubhankar Tripathy <95570942+lonexreb@users.noreply.github.com> Co-authored-by: Mark Towers <mark@anyscale.com> Co-authored-by: Adam Pryor <61172547+adam360x@users.noreply.github.com> Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com> Co-authored-by: Ding-Xin Chen <s990346@gmail.com> Co-authored-by: Lucas <54979663+Lucas61000@users.noreply.github.com> Co-authored-by: harshit-anyscale <harshit@anyscale.com> Co-authored-by: johntaylor-cell <john.taylor@anyscale.com> Co-authored-by: Rain <wangchangyu315@126.com> Co-authored-by: phattruong 
<23120318@student.hcmus.edu.vn> Co-authored-by: Huynh Tan Phuoc <165295867+Hutaph@users.noreply.github.com>


Summary

This PR clarifies the CLI output shown during `ray up`.

Currently, `ray up` streams the output of `ray start --head` from the head node back to the user's local terminal. The `ray start --head` output includes "Next steps" commands that may only work from the head node or from within the cluster network, which can be confusing when viewed from the local machine running `ray up`.

This change updates the output text to make the execution context clearer:

- The `ray start --head` "Next steps" commands are identified as head-node / cluster-network commands.
- The `ray up` "Useful Commands" section is clarified as commands intended for the local machine.
- The head-node context note is only printed when `ray start` output is actually emitted, so it is not shown in `--no-restart` flows where `ray start` is skipped.
- Shared CLI output helper functions were added so the wording can be reused consistently and tested without requiring a full Ray native build.

fixes #40833

Testing

- `ruff check` passed.
- `pytest python/ray/tests/test_cli_output_context.py` passed: 11/11.
- Tests validate emitted CLI output behavior by calling the output helpers with mocked `cli_logger` / `cf` objects.
- Added coverage for the `--no-restart` gating behavior so the head-node context note is not emitted when `ray_start_commands` is empty.
- Reviewed git diff after the review updates.

Not run:

- Manual `ray up <cluster_config.yaml>` validation, because this environment does not have live cloud provider credentials.
- Direct `ray start --head` validation, because this environment does not have a fully built Ray installation with the `_raylet` native module.

Risk

Low risk: this is a CLI output text/formatting change only. The only control-flow change gates the informational context note so it is printed only when corresponding `ray start` output exists; it does not modify cluster launch logic.