Commit c3a07a8

committed
revised
1 parent a5a4356 commit c3a07a8

1 file changed

+14
-12
lines changed

rfcs/20240806-c-api/README.md

@@ -32,12 +32,17 @@ communication libraries, while introducing a few changes, as described next:
 structures are hidden behind handles returned to the user, such as
 `ccl::stream` and `ccl::comm`.
 
-2. The API is extended with two C++ API functions to support `sycl::queue`:
+2. The API is extended to support different types of streams or queues:
 
-- `onecclResult_t onecclCreateStream(sycl::queue, &oneccl_stream)`
+- `onecclResult_t onecclCreateStreamXPU(onecclStream_t* oneccl_stream, void *args)`
+the args is a pointer to the stream or queue that is vendor specific.
+- `onecclResult_t onecclStreamCreateCPU(onecclStream_t* oneccl_stream, void* args)`
+this API is explicit for CPU.
 - `onecclResult_t onecclReleaseStream(oneccl_stream)`
 
-Once the sycl::queue is registered, it is hidden behind the ccl stream
+`onecclResult_t onecclStreamDestroy(onecclStream_t oneccl_stream)`
+
+Once the sycl::queue is registered, it is hidden behind the `cclStream_t`
 handle
 
 3. Add functions to allow users to explicitly control the lifetime of objects,
@@ -67,18 +72,15 @@ API, and the current oneCCL API.
 
 | NCCL | oneCCL (proposed C) | oneCCL (current, C++) |
 |-------------------|------------------------------|-------------------------|
-|`cudaError_t` |`onecclResult_t cudaSetDevice(device)(1)`| N/A |
 |`ncclResult_t ncclGetUniqueId (id)`| `onecclResult_t onecclGetUniqueId (id)`| `ccl::create_main_kvs(); ccl::create_kvs(main_addr);`|
-|`ncclResult_t ncclCommInitRank(comm, size, id, rank)`|`onecclResult_t onecclCommInitRank(comm, size, id, rank)`|`comm cl::create_communicator(size, rank, device, context, kvs) comms ccl:create_communicators(size, rank, device, context, kvs)`|
+|`ncclResult_t ncclCommInitRank(comm, size, id, rank)`|`onecclResult_t onecclCommInitRank(comm, size, id, rank)(1)`|`comm cl::create_communicator(size, rank, device, context, kvs) comms ccl:create_communicators(size, rank, device, context, kvs)`|
 |`ncclResult_t ncclCommInitRankConfig(comm, size, id, rank, attr)`|`onecclResult_t onecclCommInitRankConfig(comm, size, id, rank, attr)`|`comm ccl:create_communicator(size, rank, device, context, kvs, attr)`|
 |`ncclResult_t ncclCommInitAll (comms, ndev, dev_list)`|`onecclResult_t onecclCommInitAll(comms,ndev,dev_list)`| Not currently available.Working on adding support.|
 |`ncclCommSplit` | Not implemented | Not implemented |
 |`nccltResult ncclCommFinalize(comm)`|`onecclResult_t onecclCommFinalize(comm)`| N/A |
 |`ncclResult_t ncclCommDestroy(comm)`|`onecclResult_t onecclCommDestroy(comm)`| Destructor |
 
-Notice that cudaSetDevice(device) is a CUDA call, not a NCCL call. If an
-equivalent call is available in SYCL (or calling language), the proposed
-onecclSetDevice(device) will not be needed.
+This assumes that each rank is associated with a device, which has been set before calling this function (ncclCommInitRank).
 
 #### APIs related with Collective Communication operations
 
@@ -120,7 +122,7 @@ communicator::allreduce(sendbuff, recvbuff, count, datatype, op, comm, oneccl_st
 |`ncclResult_t ncclCommCuDevice(comm, device)`|`onecclResult_t onecclCommGetDevice(comm, device)`|`device communicator::get_device()`|
 |`ncclResult_t ncclCommUserRank(comm, rank)`|`onecclResult_t onecclCommUserRank(comm, rank)`|`rank communicator::rank()`|
 |`ncclResult_t ncclGetVersion(version)`|`onecclResult_t onecclGetVersion(version)`|`version ccl:get_library_version()`|
-|`ncclCommAbort` | Not implemented | N/A |
-|`ncclCommGetAsyncError`| Not implemented | N/A |
-|`ncclGetLastError` | Not implemented | N/A |
-|`ncclGetErrorString`| Not implemented | N/A |
+|`ncclCommAbort` | `onecclCommAbort` | N/A |
+|`ncclCommGetAsyncError`| `onecclCommGetAsyncError` | N/A |
+|`ncclGetLastError` | `onecclGetLastError` | N/A |
+|`ncclGetErrorString`| `onecclGetErrorString` | N/A |
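
The stream-creation change in the first hunk replaces a `sycl::queue` parameter with an opaque `void *args` and returns the stream through an out-parameter handle. A minimal sketch of that opaque-handle pattern in plain C follows. The three function names and their signatures are copied from the added lines of the diff; everything else is an assumption made for illustration: the result codes `onecclSuccess`, `onecclInvalidArgument` and `onecclSystemError`, the layout of `struct onecclStream`, the NULL checks, and the helper `mock_stream_create`. The bodies are mock stubs and may differ from what oneCCL actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed result codes, loosely modeled on ncclResult_t; not taken from the RFC. */
typedef enum {
    onecclSuccess = 0,
    onecclInvalidArgument = 1,
    onecclSystemError = 2
} onecclResult_t;

/* Hypothetical handle layout: callers only ever hold the pointer type. */
struct onecclStream {
    int is_cpu;    /* 1 if created through the CPU entry point */
    void *native;  /* vendor-specific stream or queue, stored untouched */
};
typedef struct onecclStream *onecclStream_t;

/* Hypothetical shared helper: allocate the handle and record the native pointer. */
static onecclResult_t mock_stream_create(onecclStream_t *out, void *args, int is_cpu) {
    if (out == NULL) return onecclInvalidArgument;
    struct onecclStream *s = malloc(sizeof *s);
    if (s == NULL) return onecclSystemError;
    s->is_cpu = is_cpu;
    s->native = args;
    *out = s;
    return onecclSuccess;
}

/* Signature as in the diff; rejecting a NULL args is a choice of this mock. */
onecclResult_t onecclCreateStreamXPU(onecclStream_t *oneccl_stream, void *args) {
    if (args == NULL) return onecclInvalidArgument;
    return mock_stream_create(oneccl_stream, args, 0);
}

/* Signature as in the diff; this mock accepts a NULL args for the CPU case. */
onecclResult_t onecclStreamCreateCPU(onecclStream_t *oneccl_stream, void *args) {
    return mock_stream_create(oneccl_stream, args, 1);
}

/* Signature as in the diff; frees only the handle, never the native queue. */
onecclResult_t onecclStreamDestroy(onecclStream_t oneccl_stream) {
    if (oneccl_stream == NULL) return onecclInvalidArgument;
    free(oneccl_stream);
    return onecclSuccess;
}

/* Walks one XPU and one CPU stream through create and destroy; returns 0 on success. */
int demo_stream_lifecycle(void) {
    int fake_queue = 42;  /* stand-in for a vendor queue such as a sycl::queue */
    onecclStream_t s = NULL;

    if (onecclCreateStreamXPU(&s, &fake_queue) != onecclSuccess) return 1;
    assert(s != NULL);
    if (onecclStreamDestroy(s) != onecclSuccess) return 2;

    s = NULL;
    if (onecclStreamCreateCPU(&s, NULL) != onecclSuccess) return 3;
    assert(s != NULL);
    if (onecclStreamDestroy(s) != onecclSuccess) return 4;
    return 0;
}
```

Because the native queue travels as `void *`, the header stays valid C and does not need to include SYCL; the cost is that the library cannot type-check what the caller passes in.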
