Motivation.
vLLM is a framework that supports multiple hardware backends, yet there are still hard-coded torch.cuda calls in the codebase. This is unfriendly to non-CUDA devices. Fortunately, PyTorch now ships a unified torch.accelerator API that dispatches based on the active platform.
Meanwhile, we should add a lint check to keep new torch.cuda calls out of newly added code; a rough sketch of such a check follows.
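As a minimal sketch of the lint idea (hypothetical, not an existing vLLM tool; the script name and the assumption that all sources live under `vllm/` are illustrative), a standalone script like this could be wired into pre-commit or CI:

```python
# check_torch_cuda.py: fail when a Python file under vllm/ calls torch.cuda directly.
# Hypothetical sketch for this RFC, not an existing vLLM lint rule.
import re
import sys
from pathlib import Path

# Matches direct attribute access such as torch.cuda.synchronize.
TORCH_CUDA_CALL = re.compile(r"\btorch\.cuda\.\w+")

def main() -> int:
    hits: list[str] = []
    for path in Path("vllm").rglob("*.py"):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if TORCH_CUDA_CALL.search(line):
                hits.append(f"{path}:{lineno}: {line.strip()}")
    if hits:
        print("Hard-coded torch.cuda calls found; prefer torch.accelerator:")
        print("\n".join(hits))
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

In practice an allowlist would be needed for the CUDA platform backend itself, where torch.cuda calls are legitimate.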
Proposed Change.
Status of the torch.accelerator API as of torch 2.9.0, and the corresponding replacement progress in vLLM:
| CUDA API name | Unified torch API name | torch API status | vLLM replacement status |
|---|---|---|---|
| torch.cuda.Event | torch.Event | #26985 | |
| torch.cuda.Stream | torch.Stream | | |
| torch.cuda.device_count | torch.accelerator.device_count | | |
| torch.cuda.is_available | torch.accelerator.is_available | | |
| torch.cuda.synchronize | torch.accelerator.synchronize | | |
| torch.cuda.set_stream | torch.accelerator.set_stream | | |
| torch.cuda.current_device | torch.accelerator.current_accelerator | | |
| torch.cuda.current_stream | torch.accelerator.current_stream | | |
| torch.cuda.empty_cache | torch.accelerator.empty_cache | #30681 | |
| torch.cuda.max_memory_allocated | torch.accelerator.max_memory_allocated | | |
| torch.cuda.max_memory_reserved | torch.accelerator.max_memory_reserved | | |
| torch.cuda.memory_allocated | torch.accelerator.memory_allocated | | |
| torch.cuda.memory_reserved | torch.accelerator.memory_reserved | | |
| torch.cuda.memory_stats | torch.accelerator.memory_stats | | |
| torch.cuda.reset_peak_memory_stats | torch.accelerator.reset_peak_memory_stats | | |
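To illustrate the kind of mechanical replacement the table implies, here is a minimal before/after sketch. It uses only APIs already available in recent torch releases (torch.accelerator landed in torch 2.6); the memory APIs in the table are still pending upstream per the status column:

```python
import torch

# Before: hard-coded CUDA calls, which fail on non-CUDA backends.
if torch.cuda.is_available():
    torch.cuda.synchronize()
    stream = torch.cuda.current_stream()
    event = torch.cuda.Event(enable_timing=True)

# After: unified APIs that dispatch to whichever backend is active.
if torch.accelerator.is_available():
    torch.accelerator.synchronize()
    stream = torch.accelerator.current_stream()
    event = torch.Event(enable_timing=True)
```

The after-version runs unchanged on CUDA, XPU, or any other backend that registers with torch.accelerator, which is the point of the migration.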
Feedback Period.
No response
CC List.
@youkaichao @simon-mo @WoosukKwon @zhuohan123
Any Other Things.
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.