@@ -32,12 +32,16 @@ communication libraries, while introducing a few changes, as described next:
 structures are hidden behind handles returned to the user, such as
 `ccl::stream` and `ccl::comm`.

-2. The API is extended with two C++ API functions to support `sycl::queue`:
+2. The API is extended to support different types of streams or queues:

-   - `onecclResult_t onecclCreateStream(sycl::queue, &oneccl_stream)`
-   - `onecclResult_t onecclReleaseStream(oneccl_stream)`
+   - `onecclResult_t onecclCreateStreamXPU(onecclStream_t* oneccl_stream, void* args)`
+     `args` is a pointer to the vendor-specific stream or queue.
+   - `onecclResult_t onecclStreamCreateCPU(onecclStream_t* oneccl_stream, void* args)`
+     This variant is explicitly for the CPU.

-   Once the sycl::queue is registered, it is hidden behind the ccl stream
+   - `onecclResult_t onecclStreamDestroy(onecclStream_t oneccl_stream)`
+
+   Once the `sycl::queue` is registered, it is hidden behind the `onecclStream_t`
    handle

 3. Add functions to allow users to explicitly control the lifetime of objects,
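The stream lifecycle introduced in the hunk above can be sketched as follows. This is a self-contained mock, not the oneCCL implementation: the three `oneccl*` functions are stubbed locally with the signatures shown in the diff, while the result codes (`onecclSuccess`, `onecclInvalidArgument`), the `fake_queue` type, the fields of `onecclStream`, and the null-pointer checks are assumptions made purely for illustration. In real code, `args` would point to a vendor queue such as a `sycl::queue`.

```cpp
#include <cassert>

// Hypothetical result codes; the proposal does not list the enum values.
enum onecclResult_t { onecclSuccess = 0, onecclInvalidArgument = 1 };

// Opaque handle: callers only ever hold the pointer type.
struct onecclStream;
typedef onecclStream* onecclStream_t;

// Stand-in for a vendor queue (for example a sycl::queue in real code).
struct fake_queue { int id; };

// Mock internals, which a real library would keep private.
struct onecclStream { void* native; bool is_cpu; };

// Mock of the XPU variant: wraps the vendor queue passed through `args`.
onecclResult_t onecclCreateStreamXPU(onecclStream_t* oneccl_stream, void* args) {
    if (oneccl_stream == nullptr || args == nullptr) return onecclInvalidArgument;
    *oneccl_stream = new onecclStream{args, false};
    return onecclSuccess;
}

// Mock of the CPU variant; accepting a null `args` here is an assumption.
onecclResult_t onecclStreamCreateCPU(onecclStream_t* oneccl_stream, void* args) {
    if (oneccl_stream == nullptr) return onecclInvalidArgument;
    *oneccl_stream = new onecclStream{args, true};
    return onecclSuccess;
}

// Mock of the explicit release call.
onecclResult_t onecclStreamDestroy(onecclStream_t oneccl_stream) {
    if (oneccl_stream == nullptr) return onecclInvalidArgument;
    delete oneccl_stream;
    return onecclSuccess;
}

// Walks through create -> inspect -> destroy; returns 0 when every step
// behaves as expected, or a non-zero step number otherwise.
int run_stream_lifecycle_demo() {
    fake_queue q{7};
    onecclStream_t stream = nullptr;

    if (onecclCreateStreamXPU(&stream, &q) != onecclSuccess) return 1;
    // Only the mock can peek inside the handle; users would just pass it on.
    if (stream->native != &q || stream->is_cpu) return 2;
    if (onecclStreamDestroy(stream) != onecclSuccess) return 3;

    onecclStream_t cpu_stream = nullptr;
    if (onecclStreamCreateCPU(&cpu_stream, nullptr) != onecclSuccess) return 4;
    if (!cpu_stream->is_cpu) return 5;
    if (onecclStreamDestroy(cpu_stream) != onecclSuccess) return 6;

    // A missing queue is rejected by the mock XPU variant.
    onecclStream_t bad = nullptr;
    if (onecclCreateStreamXPU(&bad, nullptr) != onecclInvalidArgument) return 7;
    return 0;
}
```

Passing the queue as `void*` keeps the C signature free of C++ types, at the cost of type safety: the callee has to trust that `args` points to the queue type it expects.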
@@ -67,18 +71,15 @@ API, and the current oneCCL API.

 | NCCL | oneCCL (proposed C) | oneCCL (current, C++) |
 | -------------------| ------------------------------| -------------------------|
-| `cudaError_t` | `onecclResult_t cudaSetDevice(device)(1)` | N/A |
 | `ncclResult_t ncclGetUniqueId(id)` | `onecclResult_t onecclGetUniqueId(id)` | `ccl::create_main_kvs(); ccl::create_kvs(main_addr);` |
-| `ncclResult_t ncclCommInitRank(comm, size, id, rank)` | `onecclResult_t onecclCommInitRank(comm, size, id, rank)` | `comm ccl::create_communicator(size, rank, device, context, kvs); comms ccl::create_communicators(size, rank, device, context, kvs)` |
+| `ncclResult_t ncclCommInitRank(comm, size, id, rank)` | `onecclResult_t onecclCommInitRank(comm, size, id, rank)` (1) | `comm ccl::create_communicator(size, rank, device, context, kvs); comms ccl::create_communicators(size, rank, device, context, kvs)` |
 | `ncclResult_t ncclCommInitRankConfig(comm, size, id, rank, attr)` | `onecclResult_t onecclCommInitRankConfig(comm, size, id, rank, attr)` | `comm ccl::create_communicator(size, rank, device, context, kvs, attr)` |
 | `ncclResult_t ncclCommInitAll(comms, ndev, dev_list)` | `onecclResult_t onecclCommInitAll(comms, ndev, dev_list)` | Not currently available; support is being added. |
 | `ncclCommSplit` | Not implemented | Not implemented |
 | `ncclResult_t ncclCommFinalize(comm)` | `onecclResult_t onecclCommFinalize(comm)` | N/A |
 | `ncclResult_t ncclCommDestroy(comm)` | `onecclResult_t onecclCommDestroy(comm)` | Destructor |

-Notice that cudaSetDevice(device) is a CUDA call, not a NCCL call. If an
-equivalent call is available in SYCL (or calling language), the proposed
-onecclSetDevice(device) will not be needed.
+(1) This assumes that each rank is associated with a device, which has been set before calling this function (`onecclCommInitRank`, the counterpart of `ncclCommInitRank`).

 #### APIs related with Collective Communication operations

@@ -120,7 +121,7 @@ communicator::allreduce(sendbuff, recvbuff, count, datatype, op, comm, oneccl_st
 | `ncclResult_t ncclCommCuDevice(comm, device)` | `onecclResult_t onecclCommGetDevice(comm, device)` | `device communicator::get_device()` |
 | `ncclResult_t ncclCommUserRank(comm, rank)` | `onecclResult_t onecclCommUserRank(comm, rank)` | `rank communicator::rank()` |
 | `ncclResult_t ncclGetVersion(version)` | `onecclResult_t onecclGetVersion(version)` | `version ccl::get_library_version()` |
-| `ncclCommAbort` | Not implemented | N/A |
-| `ncclCommGetAsyncError` | Not implemented | N/A |
-| `ncclGetLastError` | Not implemented | N/A |
-| `ncclGetErrorString` | Not implemented | N/A |
+| `ncclCommAbort` | `onecclCommAbort` | N/A |
+| `ncclCommGetAsyncError` | `onecclCommGetAsyncError` | N/A |
+| `ncclGetLastError` | `onecclGetLastError` | N/A |
+| `ncclGetErrorString` | `onecclGetErrorString` | N/A |
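With `onecclGetErrorString` now mapped in the table above, the error-checking helper commonly wrapped around NCCL calls could carry over with little change. The sketch below is a mock under stated assumptions: the diff gives only the function name, so the signature is assumed to mirror `const char* ncclGetErrorString(ncclResult_t)`, and the enum values, the message strings, `oneccl_check`, and `failing_call` are all hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical result codes; the proposal does not list the enum values.
enum onecclResult_t {
    onecclSuccess = 0,
    onecclInvalidArgument = 1,
    onecclInternalError = 2
};

// Mock with an assumed signature mirroring ncclGetErrorString.
// The message strings are placeholders, not oneCCL's wording.
const char* onecclGetErrorString(onecclResult_t result) {
    switch (result) {
        case onecclSuccess:         return "no error";
        case onecclInvalidArgument: return "invalid argument";
        case onecclInternalError:   return "internal error";
    }
    return "unknown result code";
}

// Hypothetical helper: turns a failing result code into an exception
// whose message names the call and the library's description of the error.
inline void oneccl_check(onecclResult_t result, const char* what) {
    if (result != onecclSuccess) {
        throw std::runtime_error(std::string(what) + ": " +
                                 onecclGetErrorString(result));
    }
}

// Hypothetical call that always fails, used only to exercise the helper.
onecclResult_t failing_call() { return onecclInvalidArgument; }

// Returns the message produced when a checked call fails,
// or an empty string if no exception was thrown.
std::string describe_failure() {
    try {
        oneccl_check(failing_call(), "failing_call");
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}
```

Keeping the return-code convention in the C API and layering exceptions on top in a helper leaves the choice of error style to the caller.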