Skip to content

Commit db7eac4

Browse files
EwanCRossBrunton
andauthored
[SYCL][Graph][UR] Propagate graph update list to UR (intel#17019)
Update the `urCommandBufferUpdateKernelLaunchExp` API for updating commands in a command-buffer to take a list of commands. The current API operates on a single command, this means that the SYCL-Graph `update(std::vector<nodes>)` API needs to serialize the list into N calls to the UR API. Given that both OpenCL `clUpdateMutableCommandsKHR` and Level-Zero `zeCommandListUpdateMutableCommandsExp` can operate on a list of commands, this serialization at the UR layer of the stack introduces extra host API calls. This PR improves the `urCommandBufferUpdateKernelLaunchExp` API so that a list of commands is passed all the way from SYCL to the native backend API. As highlighted in oneapi-src/unified-runtime#2671 this patch requires the handle translation auto generated code to be able to handle a list of structs, which is not currently the case. This is PR includes a API specific temporary workaround in the mako file which will unblock this PR until a more permanent solution is completed. --------- Co-authored-by: Ross Brunton <[email protected]>
1 parent b0a521b commit db7eac4

33 files changed

+1606
-1050
lines changed

sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc

+4
Original file line numberDiff line numberDiff line change
@@ -1431,6 +1431,7 @@ Exceptions:
14311431
created.
14321432
* Throws with error code `invalid` if `node` is not part of the
14331433
graph.
1434+
* If any other exception is thrown the state of the graph node is undefined.
14341435

14351436
|
14361437
[source,c++]
@@ -1465,6 +1466,7 @@ Exceptions:
14651466
`property::graph::updatable` was not set when the executable graph was created.
14661467
* Throws with error code `invalid` if any node in `nodes` is not part of the
14671468
graph.
1469+
* If any other exception is thrown the state of the graph nodes is undefined.
14681470

14691471
|
14701472
[source, c++]
@@ -1517,6 +1519,8 @@ Exceptions:
15171519
* Throws synchronously with error code `invalid` if
15181520
`property::graph::updatable` was not set when the executable graph was
15191521
created.
1522+
1523+
* If any other exception is thrown the state of the graph nodes is undefined.
15201524
|===
15211525

15221526
Table {counter: tableNumber}. Member functions of the `command_graph` class for

sycl/source/detail/graph_impl.cpp

+202-119
Large diffs are not rendered by default.

sycl/source/detail/graph_impl.hpp

+52-1
Original file line numberDiff line numberDiff line change
@@ -1303,7 +1303,30 @@ class exec_graph_impl {
13031303
void update(std::shared_ptr<node_impl> Node);
13041304
void update(const std::vector<std::shared_ptr<node_impl>> &Nodes);
13051305

1306-
void updateImpl(std::shared_ptr<node_impl> NodeImpl);
1306+
/// Calls UR entry-point to update nodes in command-buffer.
1307+
/// @param CommandBuffer The UR command-buffer to update commands in.
1308+
/// @param Nodes List of nodes to update. Only nodes which can be updated
1309+
/// through UR should be included in this list, currently this is only
1310+
/// nodes of kernel type.
1311+
void updateURImpl(ur_exp_command_buffer_handle_t CommandBuffer,
1312+
const std::vector<std::shared_ptr<node_impl>> &Nodes) const;
1313+
1314+
/// Update host-task nodes
1315+
/// @param Nodes List of nodes to update, any node that is not a host-task
1316+
/// will be ignored.
1317+
void updateHostTasksImpl(
1318+
const std::vector<std::shared_ptr<node_impl>> &Nodes) const;
1319+
1320+
/// Splits a list of nodes into separate lists of nodes for each
1321+
/// command-buffer partition.
1322+
///
1323+
/// Only nodes that can be updated through the UR interface are included
1324+
/// in the list. Currently this is only kernel node types.
1325+
///
1326+
/// @param Nodes List of nodes to split
1327+
/// @return Map of partition indexes to nodes
1328+
std::map<int, std::vector<std::shared_ptr<node_impl>>> getURUpdatableNodes(
1329+
const std::vector<std::shared_ptr<node_impl>> &Nodes) const;
13071330

13081331
unsigned long long getID() const { return MID; }
13091332

@@ -1373,6 +1396,34 @@ class exec_graph_impl {
13731396
Stream.close();
13741397
}
13751398

1399+
/// Determines if scheduler needs to be used for node update.
1400+
/// @param[in] Nodes List of nodes to be updated
1401+
/// @param[out] UpdateRequirements Accessor requirements found in /p Nodes.
1402+
/// return True if update should be done through the scheduler.
1403+
bool needsScheduledUpdate(
1404+
const std::vector<std::shared_ptr<node_impl>> &Nodes,
1405+
std::vector<sycl::detail::AccessorImplHost *> &UpdateRequirements);
1406+
1407+
/// Sets the UR struct values required to update a graph node.
1408+
/// @param[in] Node The node to be updated.
1409+
/// @param[out] BundleObjs UR objects created from kernel bundle.
1410+
/// Responsibility of the caller to release.
1411+
/// @param[out] MemobjDescs Memory object arguments to update.
1412+
/// @param[out] MemobjProps Properties used in /p MemobjDescs structs.
1413+
/// @param[out] PtrDescs Pointer arguments to update.
1414+
/// @param[out] ValueDescs Value arguments to update.
1415+
/// @param[out] NDRDesc ND-Range to update.
1416+
/// @param[out] UpdateDesc Base struct in the pointer chain.
1417+
void populateURKernelUpdateStructs(
1418+
const std::shared_ptr<node_impl> &Node,
1419+
std::pair<ur_program_handle_t, ur_kernel_handle_t> &BundleObjs,
1420+
std::vector<ur_exp_command_buffer_update_memobj_arg_desc_t> &MemobjDescs,
1421+
std::vector<ur_kernel_arg_mem_obj_properties_t> &MemobjProps,
1422+
std::vector<ur_exp_command_buffer_update_pointer_arg_desc_t> &PtrDescs,
1423+
std::vector<ur_exp_command_buffer_update_value_arg_desc_t> &ValueDescs,
1424+
sycl::detail::NDRDescT &NDRDesc,
1425+
ur_exp_command_buffer_update_kernel_launch_desc_t &UpdateDesc) const;
1426+
13761427
/// Execution schedule of nodes in the graph.
13771428
std::list<std::shared_ptr<node_impl>> MSchedule;
13781429
/// Pointer to the modifiable graph impl associated with this executable

sycl/source/detail/scheduler/commands.cpp

+11-1
Original file line numberDiff line numberDiff line change
@@ -3707,7 +3707,17 @@ ur_result_t UpdateCommandBufferCommand::enqueueImp() {
37073707
default:
37083708
break;
37093709
}
3710-
MGraph->updateImpl(Node);
3710+
}
3711+
3712+
// Split list of nodes into nodes per UR command-buffer partition, then
3713+
// call UR update on each command-buffer partition with those updatable
3714+
// nodes.
3715+
auto PartitionedNodes = MGraph->getURUpdatableNodes(MNodes);
3716+
auto Device = MQueue->get_device();
3717+
auto &Partitions = MGraph->getPartitions();
3718+
for (auto &[PartitionIndex, NodeImpl] : PartitionedNodes) {
3719+
auto CommandBuffer = Partitions[PartitionIndex]->MCommandBuffers[Device];
3720+
MGraph->updateURImpl(CommandBuffer, NodeImpl);
37113721
}
37123722

37133723
return UR_RESULT_SUCCESS;

unified-runtime/include/ur_api.h

+63-50
Original file line numberDiff line numberDiff line change
@@ -10446,6 +10446,21 @@ typedef struct ur_exp_command_buffer_desc_t {
1044610446

1044710447
} ur_exp_command_buffer_desc_t;
1044810448

10449+
///////////////////////////////////////////////////////////////////////////////
10450+
/// @brief A value that identifies a command inside of a command-buffer, used
10451+
/// for
10452+
/// defining dependencies between commands in the same command-buffer.
10453+
typedef uint32_t ur_exp_command_buffer_sync_point_t;
10454+
10455+
///////////////////////////////////////////////////////////////////////////////
10456+
/// @brief Handle of Command-Buffer object
10457+
typedef struct ur_exp_command_buffer_handle_t_ *ur_exp_command_buffer_handle_t;
10458+
10459+
///////////////////////////////////////////////////////////////////////////////
10460+
/// @brief Handle of a Command-Buffer command
10461+
typedef struct ur_exp_command_buffer_command_handle_t_
10462+
*ur_exp_command_buffer_command_handle_t;
10463+
1044910464
///////////////////////////////////////////////////////////////////////////////
1045010465
/// @brief Descriptor type for updating a kernel command memobj argument.
1045110466
typedef struct ur_exp_command_buffer_update_memobj_arg_desc_t {
@@ -10509,6 +10524,8 @@ typedef struct ur_exp_command_buffer_update_kernel_launch_desc_t {
1050910524
ur_structure_type_t stype;
1051010525
/// [in][optional] pointer to extension-specific structure
1051110526
const void *pNext;
10527+
/// [in] Handle of the command-buffer kernel command to update.
10528+
ur_exp_command_buffer_command_handle_t hCommand;
1051210529
/// [in][optional] The new kernel handle. If this parameter is nullptr,
1051310530
/// the current kernel handle in `hCommand`
1051410531
/// will be used. If a kernel handle is passed, it must be a valid kernel
@@ -10558,21 +10575,6 @@ typedef struct ur_exp_command_buffer_update_kernel_launch_desc_t {
1055810575

1055910576
} ur_exp_command_buffer_update_kernel_launch_desc_t;
1056010577

10561-
///////////////////////////////////////////////////////////////////////////////
10562-
/// @brief A value that identifies a command inside of a command-buffer, used
10563-
/// for
10564-
/// defining dependencies between commands in the same command-buffer.
10565-
typedef uint32_t ur_exp_command_buffer_sync_point_t;
10566-
10567-
///////////////////////////////////////////////////////////////////////////////
10568-
/// @brief Handle of Command-Buffer object
10569-
typedef struct ur_exp_command_buffer_handle_t_ *ur_exp_command_buffer_handle_t;
10570-
10571-
///////////////////////////////////////////////////////////////////////////////
10572-
/// @brief Handle of a Command-Buffer command
10573-
typedef struct ur_exp_command_buffer_command_handle_t_
10574-
*ur_exp_command_buffer_command_handle_t;
10575-
1057610578
///////////////////////////////////////////////////////////////////////////////
1057710579
/// @brief Create a Command-Buffer object
1057810580
///
@@ -11520,74 +11522,84 @@ UR_APIEXPORT ur_result_t UR_APICALL urCommandBufferEnqueueExp(
1152011522
///
1152111523
/// @details
1152211524
/// This entry-point is synchronous and may block if the command-buffer is
11523-
/// executing when the entry-point is called.
11525+
/// executing when the entry-point is called. On error, the state of the
11526+
/// command-buffer commands being updated is undefined.
1152411527
///
1152511528
/// @returns
1152611529
/// - ::UR_RESULT_SUCCESS
1152711530
/// - ::UR_RESULT_ERROR_UNINITIALIZED
1152811531
/// - ::UR_RESULT_ERROR_DEVICE_LOST
1152911532
/// - ::UR_RESULT_ERROR_ADAPTER_SPECIFIC
1153011533
/// - ::UR_RESULT_ERROR_INVALID_NULL_HANDLE
11531-
/// + `NULL == hCommand`
11534+
/// + `NULL == hCommandBuffer`
11535+
/// + `NULL == pUpdateKernelLaunch->hCommand`
1153211536
/// - ::UR_RESULT_ERROR_INVALID_NULL_POINTER
1153311537
/// + `NULL == pUpdateKernelLaunch`
11538+
/// - ::UR_RESULT_ERROR_INVALID_COMMAND_BUFFER_EXP
11539+
/// - ::UR_RESULT_ERROR_INVALID_SIZE
11540+
/// + `numKernelUpdates == 0`
1153411541
/// - ::UR_RESULT_ERROR_UNSUPPORTED_FEATURE
1153511542
/// + If
1153611543
/// ::UR_DEVICE_COMMAND_BUFFER_UPDATE_CAPABILITY_FLAG_KERNEL_ARGUMENTS
11537-
/// is not supported by the device, but any of
11538-
/// `pUpdateKernelLaunch->numNewMemObjArgs`,
11539-
/// `pUpdateKernelLaunch->numNewPointerArgs`, or
11540-
/// `pUpdateKernelLaunch->numNewValueArgs` are not zero.
11544+
/// is not supported by the device, and for any of any element of
11545+
/// `pUpdateKernelLaunch` the `numNewMemObjArgs`, `numNewPointerArgs`,
11546+
/// or `numNewValueArgs` members are not zero.
1154111547
/// + If
1154211548
/// ::UR_DEVICE_COMMAND_BUFFER_UPDATE_CAPABILITY_FLAG_LOCAL_WORK_SIZE is
11543-
/// not supported by the device but
11544-
/// `pUpdateKernelLaunch->pNewLocalWorkSize` is not nullptr.
11549+
/// not supported by the device, and for any element of
11550+
/// `pUpdateKernelLaunch` the `pNewLocalWorkSize` member is not nullptr.
1154511551
/// + If
1154611552
/// ::UR_DEVICE_COMMAND_BUFFER_UPDATE_CAPABILITY_FLAG_LOCAL_WORK_SIZE is
11547-
/// not supported by the device but
11548-
/// `pUpdateKernelLaunch->pNewLocalWorkSize` is nullptr and
11549-
/// `pUpdateKernelLaunch->pNewGlobalWorkSize` is not nullptr.
11553+
/// not supported by the device, and for any element of
11554+
/// `pUpdateKernelLaunch` the `pNewLocalWorkSize` member is nullptr and
11555+
/// `pNewGlobalWorkSize` is not nullptr.
1155011556
/// + If
1155111557
/// ::UR_DEVICE_COMMAND_BUFFER_UPDATE_CAPABILITY_FLAG_GLOBAL_WORK_SIZE
11552-
/// is not supported by the device but
11553-
/// `pUpdateKernelLaunch->pNewGlobalWorkSize` is not nullptr
11558+
/// is not supported by the device, and for any element of
11559+
/// `pUpdateKernelLaunch` the `pNewGlobalWorkSize` member is not nullptr
1155411560
/// + If
1155511561
/// ::UR_DEVICE_COMMAND_BUFFER_UPDATE_CAPABILITY_FLAG_GLOBAL_WORK_OFFSET
11556-
/// is not supported by the device but
11557-
/// `pUpdateKernelLaunch->pNewGlobalWorkOffset` is not nullptr.
11562+
/// is not supported by the device, and for any element of
11563+
/// `pUpdateKernelLaunch` the `pNewGlobalWorkOffset` member is not
11564+
/// nullptr.
1155811565
/// + If ::UR_DEVICE_COMMAND_BUFFER_UPDATE_CAPABILITY_FLAG_KERNEL_HANDLE
11559-
/// is not supported by the device but `pUpdateKernelLaunch->hNewKernel`
11560-
/// is not nullptr.
11566+
/// is not supported by the device, and for any element of
11567+
/// `pUpdateKernelLaunch` the `hNewKernel` member is not nullptr.
1156111568
/// - ::UR_RESULT_ERROR_INVALID_OPERATION
1156211569
/// + If ::ur_exp_command_buffer_desc_t::isUpdatable was not set to true
11563-
/// on creation of the command-buffer `hCommand` belongs to.
11564-
/// + If the command-buffer `hCommand` belongs to has not been
11565-
/// finalized.
11570+
/// on creation of the `hCommandBuffer`.
11571+
/// + If `hCommandBuffer` has not been finalized.
1156611572
/// - ::UR_RESULT_ERROR_INVALID_COMMAND_BUFFER_COMMAND_HANDLE_EXP
11567-
/// + If `hCommand` is not a kernel execution command.
11573+
/// + If for any element of `pUpdateKernelLaunch` the `hCommand` member
11574+
/// is not a kernel execution command.
11575+
/// + If for any element of `pUpdateKernelLaunch` the `hCommand` member
11576+
/// was not created from `hCommandBuffer`.
1156811577
/// - ::UR_RESULT_ERROR_INVALID_MEM_OBJECT
1156911578
/// - ::UR_RESULT_ERROR_INVALID_KERNEL_ARGUMENT_INDEX
1157011579
/// - ::UR_RESULT_ERROR_INVALID_KERNEL_ARGUMENT_SIZE
1157111580
/// - ::UR_RESULT_ERROR_INVALID_ENUMERATION
1157211581
/// - ::UR_RESULT_ERROR_INVALID_WORK_DIMENSION
11573-
/// + `pUpdateKernelLaunch->newWorkDim < 1 ||
11574-
/// pUpdateKernelLaunch->newWorkDim > 3`
11582+
/// + If for any element of `pUpdateKernelLaunch` the `newWorkDim`
11583+
/// member is less than 1 or greater than 3.
1157511584
/// - ::UR_RESULT_ERROR_INVALID_WORK_GROUP_SIZE
1157611585
/// - ::UR_RESULT_ERROR_INVALID_VALUE
11577-
/// + If `pUpdateKernelLaunch->hNewKernel` was not passed to the
11578-
/// `hKernel` or `phKernelAlternatives` parameters of
11579-
/// ::urCommandBufferAppendKernelLaunchExp when this command was
11580-
/// created.
11581-
/// + If `pUpdateKernelLaunch->newWorkDim` is different from the current
11582-
/// workDim in `hCommand` and,
11583-
/// `pUpdateKernelLaunch->pNewGlobalWorkSize`, or
11584-
/// `pUpdateKernelLaunch->pNewGlobalWorkOffset` are nullptr.
11586+
/// + If for any element of `pUpdateKernelLaunch` the `hNewKernel`
11587+
/// member was not passed to the `hKernel` or `phKernelAlternatives`
11588+
/// parameters of ::urCommandBufferAppendKernelLaunchExp when the
11589+
/// command was created.
11590+
/// + If for any element of `pUpdateKernelLaunch` the `newWorkDim`
11591+
/// member is different from the current workDim in the `hCommand`
11592+
/// member, and `pNewGlobalWorkSize` or `pNewGlobalWorkOffset` are
11593+
/// nullptr.
1158511594
/// - ::UR_RESULT_ERROR_OUT_OF_HOST_MEMORY
1158611595
/// - ::UR_RESULT_ERROR_OUT_OF_RESOURCES
1158711596
UR_APIEXPORT ur_result_t UR_APICALL urCommandBufferUpdateKernelLaunchExp(
11588-
/// [in] Handle of the command-buffer kernel command to update.
11589-
ur_exp_command_buffer_command_handle_t hCommand,
11590-
/// [in] Struct defining how the kernel command is to be updated.
11597+
/// [in] Handle of the command-buffer object.
11598+
ur_exp_command_buffer_handle_t hCommandBuffer,
11599+
/// [in] Length of pUpdateKernelLaunch.
11600+
uint32_t numKernelUpdates,
11601+
/// [in][range(0, numKernelUpdates)] List of structs defining how a
11602+
/// kernel commands are to be updated.
1159111603
const ur_exp_command_buffer_update_kernel_launch_desc_t
1159211604
*pUpdateKernelLaunch);
1159311605

@@ -14820,7 +14832,8 @@ typedef struct ur_command_buffer_enqueue_exp_params_t {
1482014832
/// @details Each entry is a pointer to the parameter passed to the function;
1482114833
/// allowing the callback the ability to modify the parameter's value
1482214834
typedef struct ur_command_buffer_update_kernel_launch_exp_params_t {
14823-
ur_exp_command_buffer_command_handle_t *phCommand;
14835+
ur_exp_command_buffer_handle_t *phCommandBuffer;
14836+
uint32_t *pnumKernelUpdates;
1482414837
const ur_exp_command_buffer_update_kernel_launch_desc_t *
1482514838
*ppUpdateKernelLaunch;
1482614839
} ur_command_buffer_update_kernel_launch_exp_params_t;

unified-runtime/include/ur_ddi.h

+1-1
Original file line numberDiff line numberDiff line change
@@ -1682,7 +1682,7 @@ typedef ur_result_t(UR_APICALL *ur_pfnCommandBufferEnqueueExp_t)(
16821682
///////////////////////////////////////////////////////////////////////////////
16831683
/// @brief Function-pointer for urCommandBufferUpdateKernelLaunchExp
16841684
typedef ur_result_t(UR_APICALL *ur_pfnCommandBufferUpdateKernelLaunchExp_t)(
1685-
ur_exp_command_buffer_command_handle_t,
1685+
ur_exp_command_buffer_handle_t, uint32_t,
16861686
const ur_exp_command_buffer_update_kernel_launch_desc_t *);
16871687

16881688
///////////////////////////////////////////////////////////////////////////////

unified-runtime/include/ur_print.hpp

+24-3
Original file line numberDiff line numberDiff line change
@@ -11481,6 +11481,11 @@ inline std::ostream &operator<<(
1148111481

1148211482
ur::details::printStruct(os, (params.pNext));
1148311483

11484+
os << ", ";
11485+
os << ".hCommand = ";
11486+
11487+
ur::details::printPtr(os, (params.hCommand));
11488+
1148411489
os << ", ";
1148511490
os << ".hNewKernel = ";
1148611491

@@ -19432,14 +19437,30 @@ inline std::ostream &
1943219437
operator<<(std::ostream &os, [[maybe_unused]] const struct
1943319438
ur_command_buffer_update_kernel_launch_exp_params_t *params) {
1943419439

19435-
os << ".hCommand = ";
19440+
os << ".hCommandBuffer = ";
1943619441

19437-
ur::details::printPtr(os, *(params->phCommand));
19442+
ur::details::printPtr(os, *(params->phCommandBuffer));
19443+
19444+
os << ", ";
19445+
os << ".numKernelUpdates = ";
19446+
19447+
os << *(params->pnumKernelUpdates);
1943819448

1943919449
os << ", ";
1944019450
os << ".pUpdateKernelLaunch = ";
19451+
ur::details::printPtr(
19452+
os, reinterpret_cast<const void *>(*(params->ppUpdateKernelLaunch)));
19453+
if (*(params->ppUpdateKernelLaunch) != NULL) {
19454+
os << " {";
19455+
for (size_t i = 0; i < *params->pnumKernelUpdates; ++i) {
19456+
if (i != 0) {
19457+
os << ", ";
19458+
}
1944119459

19442-
ur::details::printPtr(os, *(params->ppUpdateKernelLaunch));
19460+
os << (*(params->ppUpdateKernelLaunch))[i];
19461+
}
19462+
os << "}";
19463+
}
1944319464

1944419465
return os;
1944519466
}

unified-runtime/scripts/core/EXP-COMMAND-BUFFER.rst

+5-2
Original file line numberDiff line numberDiff line change
@@ -309,7 +309,8 @@ ${x}CommandBufferUpdateKernelLaunchExp.
309309
${x}_exp_command_buffer_update_kernel_launch_desc_t update {
310310
UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_UPDATE_KERNEL_LAUNCH_DESC, // stype
311311
nullptr, // pNext
312-
hNewKernel // hNewKernel
312+
hCommand, // hCommand
313+
hNewKernel, // hNewKernel
313314
2, // numNewMemobjArgs
314315
0, // numNewPointerArgs
315316
0, // numNewValueArgs
@@ -325,7 +326,7 @@ ${x}CommandBufferUpdateKernelLaunchExp.
325326
};
326327
327328
// Perform the update
328-
${x}CommandBufferUpdateKernelLaunchExp(hCommand, &update);
329+
${x}CommandBufferUpdateKernelLaunchExp(hCommandBuffer, 1, &update);
329330
330331
Command Event Update
331332
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@@ -514,6 +515,8 @@ Changelog
514515
+-----------+-------------------------------------------------------+
515516
| 1.7 | Remove command handle reference counting and querying |
516517
+-----------+-------------------------------------------------------+
518+
| 1.8 | Change Kernel command update API to take a list |
519+
+-----------+-------------------------------------------------------+
517520

518521
Contributors
519522
--------------------------------------------------------------------------------

0 commit comments

Comments
 (0)