spilling peak and average by xinyuangui2 · Pull Request #60809 · ray-project/ray

xinyuangui2 · 2026-02-06T20:10:42Z


Signed-off-by: xgui <xgui@anyscale.com>

gemini-code-assist

Code Review

This pull request introduces a SpillMetricsMonitor actor to compute and report peak and average object store spilling rates, which is a valuable addition for performance monitoring in benchmarks. The implementation uses a detached Ray actor with a background polling thread, which is a suitable design. The integration into RayDataLoaderFactory is clean. My review includes a couple of suggestions to enhance the robustness of the metric calculation and improve code consistency.
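For readers without the diff open, the polling design described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses plain Python threads rather than a detached Ray actor, and the class name `SpillRatePoller`, the `read_spilled_bytes` callable, and the `summary()` keys are all hypothetical. In the PR the counter appears to come from `memory_info.store_stats.spilled_bytes_total`.

```python
import threading
import time
from typing import Callable, Dict, List

GIB = 1024**3


class SpillRatePoller:
    """Hypothetical stand-in for the PR's monitor: plain threads, no Ray."""

    def __init__(
        self, read_spilled_bytes: Callable[[], int], interval_s: float = 1.0
    ) -> None:
        # read_spilled_bytes is an assumed hook returning a cumulative byte count.
        self._read = read_spilled_bytes
        self._interval_s = interval_s
        self._rates_gb_s: List[float] = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._poll_loop, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _poll_loop(self) -> None:
        prev_bytes, prev_time = self._read(), time.monotonic()
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval_s):
            cur_bytes, cur_time = self._read(), time.monotonic()
            delta_bytes = cur_bytes - prev_bytes
            delta_time = cur_time - prev_time
            # Skip samples with no elapsed time or a counter that went backwards.
            if delta_time > 0 and delta_bytes >= 0:
                with self._lock:
                    self._rates_gb_s.append((delta_bytes / GIB) / delta_time)
            prev_bytes, prev_time = cur_bytes, cur_time

    def summary(self) -> Dict[str, float]:
        # Copy under the lock so the poll thread can keep appending.
        with self._lock:
            rates = list(self._rates_gb_s)
        if not rates:
            return {"peak_gb_s": 0.0, "avg_gb_s": 0.0}
        return {"peak_gb_s": max(rates), "avg_gb_s": sum(rates) / len(rates)}
```

A caller would construct the poller with a function that reads the cumulative spill counter, call `start()` before the benchmark, then `stop()` and `summary()` afterwards. Returning zeros when no sample was recorded is a choice made for this sketch; the PR may handle that case differently.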

gemini-code-assist · 2026-02-06T20:11:57Z

release/train_tests/benchmark/ray_dataloader_factory.py

+        )
+        return memory_info.store_stats.spilled_bytes_total
+
+    def _poll_loop(self):


The _poll_loop method is missing a return type hint. For consistency with the rest of the codebase's type annotations, it should be specified. Since this method doesn't return a value, the hint should be -> None.

Suggested change

-    def _poll_loop(self):
+    def _poll_loop(self) -> None:

gemini-code-assist · 2026-02-06T20:11:57Z

release/train_tests/benchmark/ray_dataloader_factory.py

+                if delta_time > 0:
+                    rate_gb_s = (delta_bytes / (1024**3)) / delta_time
+                    with self._lock:
+                        self._spill_rates_gb_s.append(rate_gb_s)


The spilled_bytes_total counter could theoretically reset (e.g., on GCS restart), which would cause delta_bytes to be negative. This would result in a negative spill rate being recorded, skewing the average calculation. It's safer to only calculate the rate for non-negative delta_bytes.

Suggested change

-                if delta_time > 0:
-                    rate_gb_s = (delta_bytes / (1024**3)) / delta_time
-                    with self._lock:
-                        self._spill_rates_gb_s.append(rate_gb_s)
+                if delta_time > 0 and delta_bytes >= 0:
+                    rate_gb_s = (delta_bytes / (1024**3)) / delta_time
+                    with self._lock:
+                        self._spill_rates_gb_s.append(rate_gb_s)
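The effect of the guard can be checked in isolation with a pure function over recorded samples. The sketch below is illustrative only: the helper name `spill_rates_gb_s` and the `(timestamp_s, spilled_bytes_total)` sample layout are assumptions, and it assumes the baseline advances to the latest sample even when a sample is skipped, which the diff excerpt above does not show.

```python
from typing import List, Sequence, Tuple

GIB = 1024**3


def spill_rates_gb_s(samples: Sequence[Tuple[float, int]]) -> List[float]:
    """Convert (timestamp_s, cumulative_bytes) samples into GiB/s rates.

    Intervals with non-positive elapsed time or a negative byte delta
    (for example after a counter reset) are skipped, not recorded.
    """
    rates: List[float] = []
    for (prev_time, prev_bytes), (cur_time, cur_bytes) in zip(samples, samples[1:]):
        delta_time = cur_time - prev_time
        delta_bytes = cur_bytes - prev_bytes
        if delta_time > 0 and delta_bytes >= 0:
            rates.append((delta_bytes / GIB) / delta_time)
    return rates


# The counter drops from 2 GiB to 1 GiB at t=2.0, simulating a reset.
samples = [(0.0, 0), (1.0, 2 * GIB), (2.0, 1 * GIB), (3.0, 2 * GIB)]
print(spill_rates_gb_s(samples))  # [2.0, 1.0]
```

Without the `delta_bytes >= 0` check, the same samples would yield `[2.0, -1.0, 1.0]`, pulling the average down from 1.5 to about 0.67 GiB/s.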

spilling peak and average

081ef11

Signed-off-by: xgui <xgui@anyscale.com>

gemini-code-assist bot reviewed Feb 6, 2026

xinyuangui2 wants to merge 1 commit into ray-project:master from xinyuangui2:spilling-peak
