Skip to content

Conversation

@yurekami
Copy link
Contributor

@yurekami yurekami commented Dec 29, 2025

Summary

This PR contains two contributions:

1. Fix for Issue #4682 - Informative error message for generate_sequences

2. Documentation improvement for Issue #1345 - Google-style docstrings in device.py

Standardized all function docstrings in verl/utils/device.py to follow Google-style documentation format:

  • is_torch_npu_available(): Added detailed description and return type
  • get_visible_devices_keyword(): Clarified purpose and return values
  • get_device_name(): Improved description of supported devices
  • get_torch_device(): Documented fallback behavior
  • get_device_id(): Concise description with example
  • get_nccl_backend(): Explained HCCL vs NCCL selection
  • set_expandable_segments(): Added OOM context and Note section
  • auto_set_ascend_device_name(): Documented NPU auto-configuration
  • get_device_capability(): Added proper type hints and description

Fixes #4682
Contributes to #1345

yurekami and others added 2 commits December 30, 2025 07:01
…olcengine#4682)

The vLLMAsyncRollout.generate_sequences() method now provides a clear
error message explaining:
- Sync mode was retired in PR volcengine#4411
- Users should use async server interface (vLLMReplica, AsyncLLMServerManager)
- HFRollout can be used for synchronous generation

Also updated generation.yaml config to use async mode and document
the current limitation with main_generation.py workflow.

Fixes volcengine#4682

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…olcengine#1345)

Standardized all function docstrings in verl/utils/device.py to follow
Google-style documentation format with proper Args, Returns, and Note
sections:

- is_torch_npu_available(): Added detailed description and return type
- get_visible_devices_keyword(): Clarified purpose and return values
- get_device_name(): Improved description of supported devices
- get_torch_device(): Documented fallback behavior
- get_device_id(): Concise description with example
- get_nccl_backend(): Explained HCCL vs NCCL selection
- set_expandable_segments(): Added OOM context and Note section
- auto_set_ascend_device_name(): Documented NPU auto-configuration
- get_device_capability(): Added proper type hints and description

Contributes to volcengine#1345

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request delivers two valuable improvements. Firstly, it significantly enhances the developer experience by providing a much more informative error message for the deprecated synchronous generate_sequences method in vLLMAsyncRollout. This will prevent confusion and guide users toward the correct asynchronous implementation. Secondly, the standardization of docstrings in verl/utils/device.py to the Google style guide is a great step towards better code clarity and maintainability. I have added one high-severity comment to address a regression in type hinting within device.py to ensure consistency with the improvements being made. Overall, this is a solid contribution.


def get_torch_device() -> any:
"""Return the corresponding torch attribute based on the device type string.
def get_torch_device():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While the previous type hint -> any was incorrect (it should be -> Any), removing the type hint altogether is a regression, especially in a PR that aims to improve documentation and typing. To maintain consistency with the other functions in this file and improve code clarity for developers and static analysis tools, a correct return type hint should be provided.

The function returns a module object (e.g., torch.cuda), so types.ModuleType is the most appropriate type hint. You will need to add import types at the top of the file.

Suggested change
def get_torch_device():
def get_torch_device() -> "types.ModuleType":

@vermouth1992 vermouth1992 merged commit f53e2e5 into volcengine:main Dec 30, 2025
49 of 58 checks passed
boren-ms pushed a commit to boren-ms/verl that referenced this pull request Dec 30, 2025
…rings (volcengine#1345) (volcengine#4729)

## Summary

This PR contains two contributions:

### 1. Fix for Issue volcengine#4682 - Informative error message for
`generate_sequences`
- **Problem:** `vLLMAsyncRollout.generate_sequences()` raised a bare
`NotImplementedError`, leaving users confused when running generation
scripts
- **Root cause:** The vLLM SPMD (sync) mode was retired in PR volcengine#4411, but
the generation workflow (`main_generation.py`) still expects a
synchronous `generate_sequences()` method
- **Fix:** Added an informative error message explaining:
  - Sync mode was retired in PR volcengine#4411
- Users should use the async server interface (`vLLMReplica`,
`AsyncLLMServerManager`)
  - Alternative: use `HFRollout` for synchronous generation
  - Links to issue volcengine#4682 for details
- Also updated `generation.yaml` config comments to document the
limitation

### 2. Documentation improvement for Issue volcengine#1345 - Google-style
docstrings in `device.py`
Standardized all function docstrings in `verl/utils/device.py` to follow
Google-style documentation format:
- `is_torch_npu_available()`: Added detailed description and return type
- `get_visible_devices_keyword()`: Clarified purpose and return values
- `get_device_name()`: Improved description of supported devices
- `get_torch_device()`: Documented fallback behavior
- `get_device_id()`: Concise description with example
- `get_nccl_backend()`: Explained HCCL vs NCCL selection
- `set_expandable_segments()`: Added OOM context and Note section
- `auto_set_ascend_device_name()`: Documented NPU auto-configuration
- `get_device_capability()`: Added proper type hints and description

## Test plan
- [x] Python syntax verification passed for all modified files
- [ ] CI tests should pass (no functional changes, only error messages
and docstrings)

Fixes volcengine#4682
Contributes to volcengine#1345

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: yurekami <[email protected]>
Co-authored-by: Claude Opus 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug of examples/generation/run_deepseek_v2_lite_math.sh

3 participants