
Ray Ascend Plugin

| About Ascend | Documentation |

Overview

ray-ascend is a community-maintained hardware plugin that enables advanced Ray features on Ascend NPU accelerators.

Stock Ray natively supports Ascend NPU as a predefined resource type for binding actors and tasks (see Ray accelerator support). As an enhancement, ray-ascend provides Ascend-native features on Ray, such as collective communication via the Huawei Collective Communication Library (HCCL) and Ray Direct Transport (RDT).
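
For context, here is a minimal sketch of that built-in binding, which needs stock Ray only (it assumes Ray auto-detects the node's Ascend devices and exposes them as the predefined NPU resource; ray-ascend is not required for this part):

# Sketch only: stock Ray, no ray-ascend.
import os
import ray

ray.init()

@ray.remote(resources={"NPU": 1})
def on_npu():
    # Ray pins the task to its assigned device via ASCEND_RT_VISIBLE_DEVICES.
    return os.environ.get("ASCEND_RT_VISIBLE_DEVICES")

print(ray.get(on_npu.remote()))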

Prerequisites

  • Architecture: aarch64, x86
  • OS kernel: Linux
  • Python dependencies
    • python>=3.10, <=3.12
    • CANN==8.2.rc1
    • torch==2.7.1, torch-npu==2.7.1.post1
    • ray (the same version as ray-ascend)

Version

| Version   | Release type             | Doc |
| 0.54.0rc1 | Latest release candidate |     |

Quick start

Installation

pip install "ray-ascend[yr]"
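
The [yr] extra presumably pulls in the openYuanrong-datasystem (YR) dependency used by the YR tensor transport shown below; a plain pip install ray-ascend should be enough for the HCCL features alone.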

HCCL collective communication among Ray actors

import ray
from ray.util import collective
from ray_ascend.collective import HCCLGroup

# Register HCCL as a backend for ray.util.collective.
ray.register_collective_backend("HCCL", HCCLGroup)

# `actors` is a list of already-created Ray actors, one per NPU.
collective.create_collective_group(
    actors,
    world_size=len(actors),
    ranks=list(range(len(actors))),
    backend="HCCL",
    group_name="my_group",
)

# Inside each actor, every rank calls broadcast (SPMD); rank 0 is the source.
collective.broadcast(tensor, src_rank=0, group_name="my_group")
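
End to end, the SPMD pattern typically looks like the sketch below; the Worker class, the one-NPU-per-actor layout, and the group size of 2 are illustrative assumptions, not part of the ray-ascend API:

import ray
import torch
import torch_npu  # registers the "npu" device with PyTorch
from ray.util import collective
from ray_ascend.collective import HCCLGroup

ray.register_collective_backend("HCCL", HCCLGroup)

@ray.remote(resources={"NPU": 1})
class Worker:
    def __init__(self):
        self.tensor = torch.ones(1024, device="npu")

    def run_broadcast(self):
        # Every rank executes the same call; rank 0's tensor is the source.
        collective.broadcast(self.tensor, src_rank=0, group_name="my_group")
        return self.tensor.cpu()

actors = [Worker.remote() for _ in range(2)]
collective.create_collective_group(
    actors, world_size=2, ranks=[0, 1],
    backend="HCCL", group_name="my_group",
)
results = ray.get([a.run_broadcast.remote() for a in actors])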

Transport Ascend NPU tensors via HCCS

import ray
import torch
import torch_npu  # registers the "npu" device with PyTorch
from ray.experimental import register_tensor_transport
from ray_ascend.direct_transport import HCCLTensorTransport

# Register an HCCL-backed transport for NPU tensors.
register_tensor_transport("HCCL", ["npu"], HCCLTensorTransport)

@ray.remote
class RayActor:
    @ray.method(tensor_transport="HCCL")
    def transfer_npu_tensor_via_hccs(self):
        return torch.zeros(1024, device="npu")

sender = RayActor.remote()
npu_tensor = ray.get(sender.transfer_npu_tensor_via_hccs.remote())
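
The transport pays off when the returned object reference is consumed by another actor, since the tensor then moves NPU-to-NPU over HCCS instead of through the object store. A hypothetical receiver (the Receiver class is illustrative, not part of ray-ascend):

@ray.remote
class Receiver:
    def consume(self, t):
        # `t` arrives as an NPU tensor, delivered over HCCS.
        return t.sum().item()

receiver = Receiver.remote()
ref = sender.transfer_npu_tensor_via_hccs.remote()
print(ray.get(receiver.consume.remote(ref)))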

Transport Ascend NPU tensors via HCCS and CPU tensors via RDMA

openYuanrong-datasystem (YR) lets users transport NPU tensors (via HCCS) and CPU tensors (via RDMA, if available) through Ray objects.

import ray
import torch
import torch_npu  # registers the "npu" device with PyTorch
from ray.experimental import register_tensor_transport
from ray_ascend.direct_transport import YRTensorTransport

# Register the YR transport for both NPU and CPU tensors.
register_tensor_transport("YR", ["npu", "cpu"], YRTensorTransport)

@ray.remote
class RayActor:
    @ray.method(tensor_transport="YR")
    def transfer_npu_tensor_via_hccs(self):
        return torch.zeros(1024, device="npu")

    @ray.method(tensor_transport="YR")
    def transfer_cpu_tensor_via_rdma(self):
        return torch.zeros(1024)

sender = RayActor.remote()
npu_tensor = ray.get(sender.transfer_npu_tensor_via_hccs.remote())
cpu_tensor = ray.get(sender.transfer_cpu_tensor_via_rdma.remote())

Contributing

See CONTRIBUTING for a step-by-step guide to setting up a development environment, building, and testing. If you find a bug or want to request a feature, please file an issue.

License

Apache License 2.0. See the LICENSE file.
