Merged
Changes from 91 commits
Commits
100 commits
b11f807
[autoscaler] Clarify head node commands in ray up output
cyhapun May 17, 2026
6882ea2
[CI] Skip premerge tests for doc/*.rst changes (#63198)
andrew-anyscale May 14, 2026
89f64d0
[Data][LLM] Fix max_pending_requests default to track vLLM's GPU-depe…
Aydin-ab May 14, 2026
ae91f05
[core][train][data] Keep strong references to fire-and-forget asyncio…
XuQianJin-Stars May 14, 2026
5b744e4
[Core] Fix OOM kill msg wrong threshold w/ resource isolation (#62948)
owenowenisme May 14, 2026
9424218
[core] Fix accelerator detection on NVIDIA Blackwell consumer GPUs (#…
micah-yong-ai May 14, 2026
39391dc
[autoscaler][Data] Accept fractional resource values in request_resou…
liulehui May 14, 2026
6800a6b
Fix copy message typo in retry invoke (#63347)
werkt May 14, 2026
e2cdfe6
[serve] Deflake serve windows tests (#63281)
jeffreywang88 May 14, 2026
4ad7b90
[Data] Shut down executor when DataIterator exits early (#62949)
xinyuangui2 May 15, 2026
dc38386
[Data] Fix silent credential drop for fsspec-S3 in download expressio…
xyuzh May 15, 2026
35a4d44
[RLlib] fix duplicated "that" in collector and env_runner comments (#…
oab24413gmai May 15, 2026
cb70ca7
[core] Deflake test_placement_group_status (#63288)
Yicheng-Lu-llll May 15, 2026
31002a9
[RLlib] Add `custom_resources_per_learner` config (#63303)
ArturNiederfahrenhorst May 15, 2026
fc54a8c
[RLlib] Fix extra model outputs hanging val indexing (#62960)
Cursx May 15, 2026
5e45e21
[Core]Fix actor creation race condition of #59642 (#62994)
YoyinZyc May 15, 2026
d7ea688
[serve][llm] Replace LLM ingress router replica selection with `choos…
jeffreywang88 May 15, 2026
af10163
[serve] Add ControllerOptions for configurable controller runtime_env…
kouroshHakha May 15, 2026
e74d0f6
[deps] unifying serve test deps (#63313)
elliot-barn May 15, 2026
890218c
[data] Add `Dataset.mix()` release test microbenchmark (#63286)
justinvyu May 15, 2026
1ef9745
[serve] Fix a potential `UnboundLocalError` in `ActorReplicaWrapper.…
chenshi5012 May 15, 2026
ba89f10
[Serve] Rename loop variable to avoid shadowing dataclasses.field (#6…
XuQianJin-Stars May 15, 2026
5ba8e13
[train][data] Forward label_selector to AutoscalingCoordinator (#63287)
liulehui May 15, 2026
f6ec5d0
[Serve] Simplify background thread deployment method invocation test …
win5923 May 15, 2026
bbae81f
[docs] Fix broken Examples sidebar links on serve/llm/index page (#62…
prassanna-ravishankar May 15, 2026
6fc5058
[serve] Default RAY_SERVE_HAPROXY_TCP_NODELAY to 1 (#63353)
kouroshHakha May 15, 2026
bcac70a
[data] DataSourceV2: enable V2 by default (#63326)
goutamvenkat-anyscale May 15, 2026
3356531
[serve][llm] Configurable session-id header (RAY_SERVE_SESSION_ID_HEA…
kouroshHakha May 15, 2026
4704596
[Serve] Fix replica constructor failure tests (#62862)
vaishdho1 May 15, 2026
d67e746
[Data] Display logical memory in progress bar (#63379)
bveeramani May 15, 2026
bb2c57a
[Data] Log RAY_DATA environment variables at execution start (#63380)
bveeramani May 16, 2026
df71bce
[Serve] Mark widely-used APIs as stable (#62932)
win5923 May 16, 2026
1a3fa8f
[Serve] Run health check on user execution path to detect request-ser…
claytonlin1110 May 16, 2026
a491d0f
[Data] Add scheduling-loop max metric to DatasetStatsSummary (#63345)
xinyuangui2 May 16, 2026
1e8f1d2
[Data] Remove docs recommending increased object store memory proport…
bveeramani May 16, 2026
32be122
[LLM] Fix misleading ImportError when vLLM is installed but fails to …
ps2181 May 17, 2026
1f94a53
[runtime_env] Support .tar.gz archives for remote working_dir URIs (#…
ankushbbbr May 17, 2026
174355b
fix: broken ray data link readme (#63412)
srini047 May 18, 2026
0317501
[Core] deflake test_autoscaler_e2e (#63421)
sampan-s-nayak May 18, 2026
6e9c366
[RLlib] Deflake test_actor_manager with 30 sec timeout (#63423)
ArturNiederfahrenhorst May 18, 2026
0bc67db
[autoscaler] Add BUILD target for test_cli_output_context
Hutaph May 18, 2026
25f3eb6
[ci] Migrate tune test compute configs to new schema (#62879)
sai-miduthuri May 18, 2026
644cbe7
[ci] Migrate serve and runtime_env compute configs to new schema (#62…
sai-miduthuri May 18, 2026
ddac3bf
[autoscaler] Add Bazel target for CLI output context test
Hutaph May 18, 2026
cdcd6bd
[autoscaler] Add runfiles for CLI output context test
Hutaph May 19, 2026
9919f2d
[ci] Migrate specialized dataset compute configs to new schema (#62863)
sai-miduthuri May 18, 2026
20c0900
[doc][DOC-933] add sphinxext-opengraph (#63343)
dstrodtman May 18, 2026
93be956
[doc][DOC-941] unpin tf-keras: 2.20.0 was yanked (#63358)
dstrodtman May 18, 2026
683f768
[core] Print function context when inspecting closures in inspect_ser…
prince8273 May 18, 2026
d2ce7bd
[Docs] Fix link text/target inconsistencies in cluster/key-concepts.r…
dstrodtman May 18, 2026
2d3dc2f
[core] Warn when runtime_env package approaches upload size limit (#6…
lonexreb May 18, 2026
e5ec874
[Data] Deprecate ConcurrencyCapBackpressurePolicy (#63392)
bveeramani May 18, 2026
572af92
[Train] Fix all of Train's pydoclint issues (#63365)
pseudo-rnd-thoughts May 18, 2026
8d72590
[Data] Fail multimodal release tests on worker OOM by default (#63472)
bveeramani May 18, 2026
fdf3800
[Data] Add wide-schema worker_scaling release tests across [500, 1000…
xinyuangui2 May 18, 2026
831010a
[Data] Fail map_batches benchmark on worker OOM (#63474)
bveeramani May 19, 2026
fd647fd
[serve] Optional HAProxy retry knobs for ingress-request-router backe…
kouroshHakha May 19, 2026
9f26053
AMD GPU: Replace rocm-smi ctypes binding with amd-smi Python interfac…
adam360x May 19, 2026
cfc2501
[Data] Gate unsafe deserialization in WebDataset default decoder (#63…
bveeramani May 19, 2026
a12779c
[data] DataSourceV2: don't disable pre_buffer in Parquet scanner opti…
goutamvenkat-anyscale May 19, 2026
6f68d1f
[RLlib] Fix offline prelearner test (#63495)
ArturNiederfahrenhorst May 19, 2026
1e9976c
[core] Fix `test_task_events.py::test_failed_task_runtime_env_setup` …
edoakes May 19, 2026
284f1e2
[RLlib] Mark tests as not flakey (#63426)
ArturNiederfahrenhorst May 19, 2026
73db78b
[RLlib] Fix test node failures teardown (#63491)
ArturNiederfahrenhorst May 19, 2026
84e7358
[Data] Block pickle object columns when reading untrusted Parquet fil…
bveeramani May 19, 2026
dadac97
[Data] Cache deserialized Arrow schemas in BlockMetadataWithSchema (#…
xinyuangui2 May 19, 2026
d59dc85
[Train] Remove Predictor from train v1 (#63461)
pseudo-rnd-thoughts May 19, 2026
d2fe245
[doc][DOC-934] bump pydata-sphinx-theme 0.14.1 -> 0.17.1 (#63344)
dstrodtman May 19, 2026
8fde4f7
[doc][DOC-942] add language_info.name to async-inference notebook (#6…
dstrodtman May 19, 2026
883620c
[doc][DOC-935] bump myst-nb 1.0.0rc0 -> 1.4.0 (#63360)
dstrodtman May 19, 2026
d405ce1
[Data] Make logging configurable by RAY_DATA_LOG_LEVEL env var (#63487)
bveeramani May 19, 2026
3d04cb8
[Core] Minor cleanup in cluster_resource_data (#63399)
dancingactor May 19, 2026
47d9429
Fix runtime env cache not detecting changes in requirements.txt files…
Lucas61000 May 19, 2026
f664b5b
[serve][ci] Split flaky tests into CPU and GPU steps (#63372)
harshit-anyscale May 19, 2026
9e1e07b
[data] add `read_parquet` metadata fetch memory regression test (#63376)
justinvyu May 19, 2026
1a48813
[Data] Replace TaskDurationStats with DistributionTracker (#63488)
bveeramani May 19, 2026
98bfef9
[Data] Move test_read_parquet_v2 out of unit tests (#63522)
goutamvenkat-anyscale May 19, 2026
5aca5c7
Removed deprecated DeploymentMode (#63510)
johntaylor-cell May 19, 2026
1e9264b
[doc][DOC-1044] restore active-page sidebar expansion under pydata 0.…
dstrodtman May 19, 2026
d1017e9
[Data] DataSourceV2: keep deprecated read_parquet args working with w…
goutamvenkat-anyscale May 20, 2026
0da5691
[Data] Use precise return type for DistributionTracker.as_dict (#63530)
bveeramani May 20, 2026
c0e8c4a
[Data] Track peak USS memory per task via DistributionTracker (#63489)
bveeramani May 20, 2026
f040c3d
[data][llm] Create data in input dir for the checkpointing doc exampl…
jeffreywang88 May 20, 2026
96f43d7
[Data] Remove column renaming from the read stage (#63384)
goutamvenkat-anyscale May 20, 2026
e59fd2e
[RLlib] Put only one copy of weights into object store (#63529)
ArturNiederfahrenhorst May 20, 2026
b56e915
[jobs] Surface WebSocket close codes and errors in job log streaming …
ChangyuWang May 20, 2026
e0f2aca
[core] Improve WARNING message in inspect_serializability with action…
prince8273 May 20, 2026
3d55b13
[data] Stabilize parquet memory growth regression test
TruongQuangPhat May 27, 2026
eeeecaf
Merge upstream master and resolve conflicts
TruongQuangPhat May 27, 2026
4a5e3d0
Move pure Python tests to unit test directories
TruongQuangPhat May 27, 2026
50fb1fe
Merge branch 'master' into master
Hutaph May 27, 2026
9e1890b
[data][ci] Fix lint failures from parquet schema test move
Hutaph May 27, 2026
761a944
[core][ci] Remove duplicate GCS actor manager tests
Hutaph May 27, 2026
bb4edb9
Merge branch 'master' into master
Hutaph May 28, 2026
8ec066c
[core][data] Fix microcheck unit test failures
Hutaph May 28, 2026
25fca55
Merge branch 'master' into master
edoakes May 28, 2026
c724a65
Merge branch 'master' into master
Hutaph May 29, 2026
5ec7ad4
[data] Fix pyrefly check for parquet schema inference test
Hutaph May 29, 2026
4c36f36
Merge branch 'master' into master
Hutaph May 30, 2026
9289d35
[autoscaler] Remove unrelated data changes
Hutaph May 30, 2026
9 changes: 9 additions & 0 deletions python/ray/autoscaler/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -29,3 +29,12 @@ filegroup(
]),
visibility = ["//:__pkg__"],
)

filegroup(
    name = "cli_output_context_test_data",
    srcs = [
        "_private/cli_output_helpers.py",
        "_private/commands.py",
    ],
    visibility = ["//python/ray/tests:__pkg__"],
)
39 changes: 39 additions & 0 deletions python/ray/autoscaler/_private/cli_output_helpers.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
"""Lightweight CLI output helpers for cluster launcher context notes.

These functions produce informational output that disambiguates head-node
commands from local-machine commands. They have NO dependency on the ``ray``
C++ runtime and accept ``cli_logger`` / ``cf`` as explicit parameters so they
can be tested in isolation.
"""


def print_next_steps_context_note(cli_logger, cf):
    """Print a dimmed note at the top of the 'Next steps' block.

    Informs the user that the commands below are intended for the head node
    or for machines within the cluster network.
    """
    cli_logger.print(cf.dimmed("Note: The following commands are intended for use on"))
    cli_logger.print(cf.dimmed("the head node or within the cluster network."))
    cli_logger.newline()


def print_head_node_context_separator(cli_logger, cf):
    """Print a visual separator and context note after head-node output.

    Used by ``ray up`` to separate the streamed ``ray start`` output from
    the local-machine commands that follow.
    """
    cli_logger.print(cf.dimmed("-" * 60))
    cli_logger.print(
        cf.dimmed("Note: The output above is from the head node (via `ray start`).")
    )
    cli_logger.print(
        cf.dimmed(" Commands shown in 'Next steps' may only work from the head node")
    )
    cli_logger.print(cf.dimmed(" or from within the cluster network."))
    cli_logger.newline()


# Group heading used by ``ray up`` for the local commands section.
USEFUL_COMMANDS_HEADING = "Useful commands for your local machine:"
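
Because both helpers take `cli_logger` and `cf` as parameters, they can be exercised with plain stand-ins and no Ray import. The following is a minimal sketch of that idea, not the test added in this PR: `RecordingLogger` and `PlainFormatter` are hypothetical stubs, and `print_next_steps_context_note` is copied from the diff above so the snippet runs on its own.

```python
class RecordingLogger:
    """Hypothetical stand-in for cli_logger: records what would be printed."""

    def __init__(self):
        self.lines = []

    def print(self, msg):
        self.lines.append(msg)

    def newline(self):
        # Represent a blank line as an empty string.
        self.lines.append("")


class PlainFormatter:
    """Hypothetical stand-in for cf: returns the text without styling."""

    @staticmethod
    def dimmed(text):
        return text


def print_next_steps_context_note(cli_logger, cf):
    # Copied from the helper in this diff so the sketch runs without Ray.
    cli_logger.print(cf.dimmed("Note: The following commands are intended for use on"))
    cli_logger.print(cf.dimmed("the head node or within the cluster network."))
    cli_logger.newline()


logger = RecordingLogger()
print_next_steps_context_note(logger, PlainFormatter())
for line in logger.lines:
    print(line)
```

Passing the logger and formatter in, rather than importing them inside the helper, is what keeps a check like this independent of the Ray runtime.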
8 changes: 7 additions & 1 deletion python/ray/autoscaler/_private/commands.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,10 @@
from ray.autoscaler._private import subprocess_output_util as cmd_output_util
from ray.autoscaler._private.autoscaler import AutoscalerSummary
from ray.autoscaler._private.cli_logger import cf, cli_logger
from ray.autoscaler._private.cli_output_helpers import (
USEFUL_COMMANDS_HEADING,
print_head_node_context_separator,
)
from ray.autoscaler._private.cluster_dump import (
Archive,
GetParameters,
Expand Down Expand Up @@ -932,7 +936,9 @@ def get_or_create_head_node(
modifiers = ""

 cli_logger.newline()
-with cli_logger.group("Useful commands:"):
+if ray_start_commands:
+    print_head_node_context_separator(cli_logger, cf)
+with cli_logger.group(USEFUL_COMMANDS_HEADING):
     printable_config_file = os.path.abspath(printable_config_file)

     cli_logger.print("To terminate the cluster:")
Expand Down
3 changes: 2 additions & 1 deletion python/ray/data/tests/datasource/test_parquet.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
import pandas as pd
import pyarrow as pa
import pyarrow.dataset as pds
import pyarrow.fs as pafs
import pyarrow.parquet as pq
import pytest
from packaging.version import parse as parse_version
Expand All @@ -24,6 +25,7 @@
_MAX_PYARROW_TO_BATCHES_BATCH_SIZE,
ParquetDatasource,
_coerce_pyarrow_fragment_batch_size,
_infer_schema,
_read_batches_from,
)
from ray.data._internal.execution.interfaces.ref_bundle import (
Expand Down Expand Up @@ -1779,7 +1781,6 @@ def test_read_null_data_in_first_file(
{"data": "spam"},
]


def test_read_parquet_does_not_call_infer_schema(
tmp_path, monkeypatch, ray_start_regular_shared
):
Expand Down
68 changes: 68 additions & 0 deletions python/ray/data/tests/unit/test_parquet_schema_inference.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
import pytest
import pyarrow as pa
import pyarrow.dataset as pds
import pyarrow.fs as pafs
import pyarrow.parquet as pq
from packaging.version import parse as parse_version

from ray._common.utils import get_pyarrow_version
from ray.data._internal.datasource.parquet_datasource import _infer_schema


def test_read_parquet_memory_growth(tmp_path, monkeypatch):
    """Schema inference should not inspect every fragment on PyArrow >= 22.

    Regression test for a bug where _infer_schema fell back to reading every
    fragment's physical_schema when the sampled fragment had a pa.null() column
    (PyArrow < 22.0), causing O(N) metadata reads and memory usage.
    """
    if get_pyarrow_version() < parse_version("22.0.0"):
        pytest.skip("Bounded permissive schema inspection requires PyArrow >= 22.0.0")

    num_cols = 50
    num_files = 1000
    inspect_num_fragments = 1

    def _write_files(directory, n_files):
        directory.mkdir(exist_ok=True)
        for i in range(n_files):
            cols = {f"col_{j}": [0] for j in range(num_cols)}
            # First file has a column of all nulls, which triggers the schema inference fallback.
            if i == 0:
                cols["null_col"] = pa.nulls(1)
            else:
                cols["null_col"] = [1]
            pq.write_table(pa.table(cols), directory / f"part_{i:05d}.parquet")

    _write_files(tmp_path, num_files)

    inspect_calls = []
    real_factory = pds.FileSystemDatasetFactory

    # RSS deltas for this code path are sub-MiB in CI, so check the bounded
    # schema-inspection behavior directly instead of comparing process memory.
    class TrackingFactory:
        def __init__(self, *args, **kwargs):
            self._factory = real_factory(*args, **kwargs)

        def inspect(self, **kwargs):
            inspect_calls.append(kwargs)
            return self._factory.inspect(**kwargs)

        def finish(self, *args, **kwargs):
            pytest.fail("Schema inference should not inspect every fragment")

    monkeypatch.setattr(pds, "FileSystemDatasetFactory", TrackingFactory)

    schema = _infer_schema(
        [str(path) for path in sorted(tmp_path.iterdir())],
        inspect_num_fragments=inspect_num_fragments,
        filesystem=pafs.LocalFileSystem(),
    )

    assert inspect_calls == [
        {
            "fragments": inspect_num_fragments,
            "promote_options": "permissive",
        }
    ]
    assert "null_col" in schema.names
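
The test above replaces the real factory with a wrapper that records `inspect()` calls and fails on `finish()`. The same pattern can be sketched without PyArrow, purely as an illustration: `RealFactory` and `make_tracking_factory` are hypothetical names, and the stand-in's return values are invented for the example.

```python
class RealFactory:
    """Hypothetical stand-in for the factory whose calls are being tracked."""

    def inspect(self, **kwargs):
        return {"inspected": kwargs}

    def finish(self):
        return "full scan"


def make_tracking_factory(real_factory, calls):
    """Build a wrapper class that records inspect() calls and forbids finish()."""

    class TrackingFactory:
        def __init__(self, *args, **kwargs):
            # Delegate construction to the wrapped factory.
            self._factory = real_factory(*args, **kwargs)

        def inspect(self, **kwargs):
            # Record the keyword arguments, then forward the call unchanged.
            calls.append(kwargs)
            return self._factory.inspect(**kwargs)

        def finish(self, *args, **kwargs):
            # Reaching this method means the unbounded path was taken.
            raise AssertionError("finish() should not be called")

    return TrackingFactory


calls = []
Tracking = make_tracking_factory(RealFactory, calls)
result = Tracking().inspect(fragments=1, promote_options="permissive")
print(calls)
```

Asserting on the recorded arguments checks the bounded behavior directly, which is the stated reason the test avoids comparing process memory.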
2 changes: 2 additions & 0 deletions python/ray/scripts/scripts.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
parse_resources_json,
)
from ray.autoscaler._private.cli_logger import add_click_logging_options, cf, cli_logger
from ray.autoscaler._private.cli_output_helpers import print_next_steps_context_note
from ray.autoscaler._private.commands import (
RUN_ENV_TYPES,
attach_cluster,
Expand Down Expand Up @@ -1082,6 +1083,7 @@ def start(
cli_logger.success("-" * len(startup_msg))
cli_logger.newline()
with cli_logger.group("Next steps"):
print_next_steps_context_note(cli_logger, cf)
dashboard_url = node.address_info["webui_url"]
if ray_constants.ENABLE_RAY_CLUSTER:
cli_logger.print("To add another node to this Ray cluster, run")
Expand Down
16 changes: 16 additions & 0 deletions python/ray/tests/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -834,6 +834,22 @@ py_test_module_list(
],
)

py_test(
    name = "test_cli_output_context",
    size = "small",
    srcs = ["test_cli_output_context.py"],
    data = ["//python/ray/autoscaler:cli_output_context_test_data"],
    tags = [
        "exclusive",
        "small_size_python_tests",
        "team:core",
    ],
    deps = [
        ":conftest",
        "//:ray_lib",
    ],
)

py_test_module_list(
size = "small",
files = [
Expand Down
3 changes: 3 additions & 0 deletions python/ray/tests/test_cli_patterns/test_ray_start.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@ Ray runtime started.
--------------------

Next steps
Note: The following commands are intended for use on
the head node or within the cluster network\.

To add another node to this Ray cluster, run
ray start --address='.+'

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,9 @@ Ray runtime started.
--------------------

Next steps
Note: The following commands are intended for use on
the head node or within the cluster network\.

To add another node to this Ray cluster, run
RAY_ENABLE_WINDOWS_OR_OSX_CLUSTER=1 ray start --address='.+'

Expand Down
7 changes: 6 additions & 1 deletion python/ray/tests/test_cli_patterns/test_ray_up.txt
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,12 @@ Acquiring an up-to-date head node
\[7/7\] Starting the Ray runtime
New status: up-to-date

-Useful commands:
+-{60}
+Note: The output above is from the head node \(via `ray start`\)\.
+ Commands shown in 'Next steps' may only work from the head node
+ or from within the cluster network\.
+
+Useful commands for your local machine:
To terminate the cluster:
ray down .+

Expand Down
7 changes: 6 additions & 1 deletion python/ray/tests/test_cli_patterns/test_ray_up_docker.txt
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,12 @@ Acquiring an up-to-date head node
\[7/7\] Starting the Ray runtime
New status: up-to-date

-Useful commands:
+-{60}
+Note: The output above is from the head node \(via `ray start`\)\.
+ Commands shown in 'Next steps' may only work from the head node
+ or from within the cluster network\.
+
+Useful commands for your local machine:
To terminate the cluster:
ray down .+

Expand Down
6 changes: 5 additions & 1 deletion python/ray/tests/test_cli_patterns/test_ray_up_record.txt
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,11 @@
.+\.py.*AWSNodeProvider: Set tag ray-runtime-config=.+ on \['.+'\] \[LogTimer=.+\]
.+\.py.*AWSNodeProvider: Set tag ray-file-mounts-contents=.+ on \['.+'\] \[LogTimer=.+\]
.+\.py.*New status: up-to-date
-.+\.py.*Useful commands:
+.+\.py.*-{60}
+.+\.py.*Note: The output above is from the head node \(via `ray start`\)\.
+.+\.py.* Commands shown in 'Next steps' may only work from the head node
+.+\.py.* or from within the cluster network\.
+.+\.py.*Useful commands for your local machine:
.+\.py.*To terminate the cluster:
.+\.py.* ray down .+
.+\.py.*To retrieve the IP address of the cluster head:
Expand Down
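
The pattern files above appear to hold one regular expression per expected output line, which would explain why literal parentheses and periods are escaped and the separator is written as `-{60}`. Below is a minimal sketch of that kind of matching, assuming each pattern must match its whole output line. The harness that consumes these files is not shown in this diff and may behave differently, and the `ray down` path is a made-up example.

```python
import re

# Patterns taken from the updated test_ray_up.txt lines in this diff.
patterns = [
    r"-{60}",
    r"Note: The output above is from the head node \(via `ray start`\)\.",
    r"Useful commands for your local machine:",
    r"ray down .+",
]

# Hypothetical CLI output lines that the patterns are meant to accept.
output_lines = [
    "-" * 60,
    "Note: The output above is from the head node (via `ray start`).",
    "Useful commands for your local machine:",
    "ray down /tmp/example-cluster.yaml",
]

# Collect every (pattern, line) pair where the pattern does not match the full line.
mismatches = [
    (pattern, line)
    for pattern, line in zip(patterns, output_lines)
    if re.fullmatch(pattern, line) is None
]
print(mismatches)
```

Without the backslashes, `(` and `)` would form a regex group and `.` would match any character, so the escaped forms are what pin the pattern to the literal text.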