
Conversation

@jikunshang
Collaborator

@jikunshang jikunshang commented Dec 15, 2025

Purpose

vLLM is a framework that supports multiple hardware backends, but there are still some hard-coded torch.cuda calls, which are unfriendly to non-CUDA devices. Fortunately, PyTorch now provides the torch.accelerator API set, which dispatches based on the platform.
I will create a series of PRs to address this issue, starting with the empty_cache API.
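
As an illustration, the change at each call site looks roughly like this (a sketch of the pattern, not an exact diff from this PR):

import torch

# before: hard-coded CUDA call, unusable on non-CUDA backends
torch.cuda.empty_cache()

# after: dispatches to the current accelerator backend (CUDA, XPU, ...)
torch.accelerator.empty_cache()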

Test Plan

CI.

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@mergify

mergify bot commented Dec 15, 2025

Documentation preview: https://vllm--30681.org.readthedocs.build/en/30681/

@mergify mergify bot added the documentation, nvidia, and v1 labels Dec 15, 2025
@jikunshang jikunshang marked this pull request as draft December 15, 2025 08:25
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request replaces direct calls to torch.cuda.empty_cache() with the more hardware-agnostic torch.accelerator.empty_cache(). This is a good step towards making vLLM compatible with non-CUDA devices. The changes are applied consistently across the codebase, including in examples, utility functions, and core model execution logic. A new pre-commit hook is also added to prevent future usage of torch.cuda.empty_cache. My review focuses on the implementation of this new pre-commit hook, where I've found a couple of critical issues that would prevent it from working as intended. I've provided suggestions to fix them. The rest of the changes look good.

Comment on lines 14 to 18
ALLOWED_FILES = {"tests/", "benchmarks/", "vllm/platforms/*"}


def is_allowed_file(current_file: str) -> bool:
    return current_file in ALLOWED_FILES
Contributor


critical

The check current_file in ALLOWED_FILES performs an exact match, but ALLOWED_FILES contains directory prefixes. For example, a file tests/models/test_llama.py will not be matched against tests/. This will cause the pre-commit hook to incorrectly flag files that should be allowed. You should use startswith to check if the file path is under one of the allowed directories. Also, vllm/platforms/* seems to intend to match all files in the directory, so it should probably be vllm/platforms/.

Suggested change
- ALLOWED_FILES = {"tests/", "benchmarks/", "vllm/platforms/*"}
- def is_allowed_file(current_file: str) -> bool:
-     return current_file in ALLOWED_FILES
+ ALLOWED_FILES = {"tests/", "benchmarks/", "vllm/platforms/"}
+ def is_allowed_file(current_file: str) -> bool:
+     return any(current_file.startswith(p) for p in ALLOWED_FILES)
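
With the prefix check, paths under the allowed directories pass and everything else is flagged (the file paths below are hypothetical examples):

is_allowed_file("tests/models/test_llama.py")   # True: under tests/
is_allowed_file("vllm/platforms/cuda.py")       # True: under vllm/platforms/
is_allowed_file("vllm/worker/worker.py")        # False: flagged by the hook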

Comment on lines 21 to 23
def is_forbidden_torch_cuda_api(line: str) -> bool:
    stripped = line.strip()
    return bool(_TORCH_CUDA_RE.match(stripped))
Contributor


critical

re.match only checks for a match at the beginning of the string. This will fail to detect forbidden API calls that are not at the start of a line (after stripping whitespace), for example x = torch.cuda.empty_cache(). You should use re.search to find a match anywhere in the line.

Suggested change
- def is_forbidden_torch_cuda_api(line: str) -> bool:
-     stripped = line.strip()
-     return bool(_TORCH_CUDA_RE.match(stripped))
+ def is_forbidden_torch_cuda_api(line: str) -> bool:
+     return bool(_TORCH_CUDA_RE.search(line))
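
To see the difference, assume a pattern such as the following (the actual regex is not shown in this diff):

import re

_TORCH_CUDA_RE = re.compile(r"torch\.cuda\.empty_cache")  # assumed pattern

line = "x = torch.cuda.empty_cache()"
bool(_TORCH_CUDA_RE.match(line.strip()))  # False: match only anchors at position 0
bool(_TORCH_CUDA_RE.search(line))         # True: search scans the whole line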

entry: python tools/pre_commit/check_torch_cuda.py
language: python
types: [python]
pass_filenames: false
Member


We should pass file names, otherwise this will run on all files every commit

Suggested change
- pass_filenames: false

Member


The hook doesn't need to work out which files it should run on, pre-commit already does that for us. Please look at tools/pre_commit/check_pickle_imports.py to see how this should be done
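
For reference, a filename-driven hook follows roughly this shape (a minimal sketch modeled on that style; the regex and allowed prefixes are assumptions, not the PR's actual code):

import re
import sys

_TORCH_CUDA_RE = re.compile(r"torch\.cuda\.empty_cache")  # assumed pattern
ALLOWED_PREFIXES = ("tests/", "benchmarks/", "vllm/platforms/")

def main() -> int:
    # pre-commit passes the staged file paths as command-line arguments,
    # so the hook only scans the files being committed
    returncode = 0
    for path in sys.argv[1:]:
        if path.startswith(ALLOWED_PREFIXES):
            continue
        with open(path, encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                if _TORCH_CUDA_RE.search(line):
                    print(f"{path}:{lineno}: use torch.accelerator.empty_cache() instead")
                    returncode = 1
    return returncode

if __name__ == "__main__":
    raise SystemExit(main())

With this shape, the hook config can drop pass_filenames: false, as suggested above.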

@github-project-automation github-project-automation bot moved this to In review in NVIDIA Dec 15, 2025
Signed-off-by: Kunshang Ji <[email protected]>
Signed-off-by: Kunshang Ji <[email protected]>
Signed-off-by: Kunshang Ji <[email protected]>

Labels

documentation (Improvements or additions to documentation), nvidia, v1

Projects

Status: In review
