
[oneDPL] Indirectly Device Accessible Iterator Customization Point and Public Trait #620


Open: wants to merge 39 commits into base: main
Showing changes from 14 commits

Commits (39):
d90682b
Initial draft
danhoeflinger Apr 3, 2025
24b879d
grammar, clarity
danhoeflinger Apr 3, 2025
ff940e8
formatting
danhoeflinger Apr 3, 2025
0c0804b
code indentation
danhoeflinger Apr 4, 2025
8cff3e4
testing indentation formatting
danhoeflinger Apr 4, 2025
a36a006
indentation of comments in code
danhoeflinger Apr 4, 2025
cb102cb
removing unnecessary const ref
danhoeflinger Apr 4, 2025
bc32ce2
more indentation changes
danhoeflinger Apr 4, 2025
e7ffb99
language improvements
danhoeflinger Apr 4, 2025
913d555
remove some repetition
danhoeflinger Apr 4, 2025
29e209f
fix underline
danhoeflinger Apr 4, 2025
180a500
improve example, shorten var names
danhoeflinger Apr 4, 2025
a852f0f
improve comment clarity
danhoeflinger Apr 4, 2025
41b9828
Adding link for SYCL
danhoeflinger Apr 4, 2025
96b1b45
Device accessible content instead of passed directly
danhoeflinger Apr 4, 2025
dd71fdf
more information about device policy iterator compatibility
danhoeflinger Apr 4, 2025
21b8d0f
language improvement
danhoeflinger Apr 4, 2025
7c76593
readjusting content
danhoeflinger Apr 4, 2025
250aed2
minor improvements
danhoeflinger Apr 4, 2025
696d925
Improve language
danhoeflinger Apr 4, 2025
b250e77
accept suggestion
danhoeflinger Apr 7, 2025
8928a06
removing legacy passed direcly description
danhoeflinger Apr 7, 2025
1fff43d
Adjusting structure, and some language improvements
danhoeflinger Apr 7, 2025
59c147a
remove "always"
danhoeflinger Apr 8, 2025
0fe3101
Address feedback, take suggestions
danhoeflinger Apr 8, 2025
1d44b35
adding template to the text
danhoeflinger Apr 8, 2025
a918a45
remove usm std::vector::iterators mention
danhoeflinger Apr 8, 2025
25edb0b
adjust code formatting
danhoeflinger Apr 8, 2025
b4aeaf0
Signed-off-by: Dan Hoeflinger <[email protected]>
danhoeflinger Apr 8, 2025
4967f43
are -> is
danhoeflinger Apr 9, 2025
583fd52
language adjustment for base characteristic;
danhoeflinger Apr 9, 2025
b8a6018
rename to "indirectly device accessible iterators"
danhoeflinger Apr 9, 2025
311f2cb
formatting
danhoeflinger Apr 9, 2025
8676878
Adding a section for other iterators
danhoeflinger Apr 10, 2025
ef67c3d
remove unnecessary implementation details
danhoeflinger Apr 11, 2025
0f64be0
trait<buffer wrapper> = true
danhoeflinger Apr 11, 2025
780f501
restricting permutation_iterator SourceIterator to indirectly device …
danhoeflinger Apr 11, 2025
a57c7d7
formatting
danhoeflinger Apr 11, 2025
6114c2c
language improvements
danhoeflinger Apr 16, 2025

@@ -99,7 +99,11 @@ to run algorithms on a SYCL device. When an algorithm runs with ``device_policy``
it is capable of processing SYCL buffers (passed via ``oneapi::dpl::begin/end``),
data in the host memory and data in Unified Shared Memory (USM), including device USM.
Data placed in the host memory and USM can be passed to oneDPL algorithms
-as pointers and random access iterators. The way to transfer data from the host memory
+as pointers and random access iterators. oneDPL provides some :ref:`iterators <iterators>` that are
+compatible with algorithms when using a ``device_policy``. Custom iterators may also be used, but users should
+ensure that those iterators have a defined :ref:`"passed directly" customization point <iterators-passed-directly>`
+to avoid unnecessary data movement. The iterators must also be SYCL device-copyable to be used with
+algorithms utilizing a ``device_policy``. The way to transfer data from the host memory
to a device and back is unspecified; per-element data movement to/from a temporary storage
is a possible valid implementation.

108 changes: 108 additions & 0 deletions source/elements/oneDPL/source/parallel_api/iterators.rst
@@ -3,6 +3,8 @@
..
.. SPDX-License-Identifier: CC-BY-4.0

.. _iterators:

Iterators
---------

@@ -301,3 +303,109 @@ operation were applied to each of these iterators. The types ``T`` within the te

``make_zip_iterator`` constructs and returns an instance of ``zip_iterator``
using the set of source iterators provided.

.. _iterators-passed-directly:

Customization Point for "Passed Directly" Iterators
---------------------------------------------------

Iterator types can be "passed directly" to `SYCL`_ kernels when they can inherently be dereferenced on the device while
using an algorithm with a ``device_policy``. Examples of iterators which can be "passed directly" include pointers to
SYCL USM shared or device memory, and iterator types like ``counting_iterator`` or ``discard_iterator`` that do not
require any data to be copied to the device. An example of an iterator type that cannot be "passed directly" is a
``std::vector`` iterator, which requires the data to be copied to the device in some way prior to usage in a SYCL
kernel within algorithms used with a ``device_policy``.

oneDPL provides a mechanism to define whether custom iterator types can be "passed directly" to SYCL kernels. oneDPL
queries this information at compile time to determine how to handle the iterator type when passed to algorithms with a
``device_policy``. This is important because it allows oneDPL to avoid unnecessary data movement. This is achieved using
the ``is_passed_directly_in_onedpl_device_policies`` Argument-Dependent Lookup (ADL) customization point and the public
trait ``is_passed_directly_to_device[_v]``.

ADL Customization Point: ``is_passed_directly_in_onedpl_device_policies``
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

A free function ``is_passed_directly_in_onedpl_device_policies(IteratorT)`` may be defined, which accepts an argument
of type ``IteratorT`` and returns a type with the characteristics of ``std::true_type`` if ``IteratorT`` can be
"passed directly" to SYCL kernels, or alternatively returns a type with the characteristics of ``std::false_type``
otherwise. The function must be defined in one of the valid search locations for ADL lookup, which includes the
namespace of the definition of the iterator type ``IteratorT``.
Contributor

This sentence sounds a bit awkward to me. The last letter of ADL already stands for "lookup". Also, "valid search locations" is somewhat unusual. How about this:

Suggested change
-otherwise. The function must be defined in one of the valid search locations for ADL lookup, which includes the
-namespace of the definition of the iterator type ``IteratorT``.
+otherwise. The function must be discoverable by ADL, which includes the
+namespace of the definition of the iterator type ``IteratorT``.

You may keep this part if you think it's useful: "... , which includes the
namespace of the definition of the iterator type IteratorT", but I think it could also be removed. The choice is yours.

Contributor Author

Taken, I removed the second part. It doesn't really make sense after the edit. It's more something for the documentation than the spec.


The function ``is_passed_directly_in_onedpl_device_policies`` may be used by oneDPL to determine whether the iterator
type can be "passed directly" to SYCL kernels by interrogating its return type at compile time only. It shall not be
called by oneDPL outside a ``decltype`` context to determine the return type. This means that overloads may be provided
as forward declarations only, without a body defined. ADL is used to determine which function overload to use
according to the rules in the `C++ Standard`_. Therefore, derived iterator types without an overload for their exact
type will match their most specific base iterator type if such an overload exists.
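
The two properties described above (overloads are only inspected inside ``decltype``, and a derived iterator falls back to the overload of its base) can be sketched in standard C++. Everything in namespace ``sketch`` is a hypothetical stand-in for the library side and is not oneDPL's implementation; the ``usr`` types are likewise invented for illustration.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the library side; NOT oneDPL's actual implementation.
namespace sketch
{
// Fallback found by ordinary lookup; the ellipsis ranks below any overload found by ADL.
std::false_type is_passed_directly_in_onedpl_device_policies(...);

// The customization point is only ever named inside decltype, so it is never called.
template <typename T>
using is_passed_directly_t = decltype(is_passed_directly_in_onedpl_device_policies(std::declval<T>()));

template <typename T>
inline constexpr bool is_passed_directly_v = is_passed_directly_t<T>::value;
} // namespace sketch

namespace usr
{
struct base_it {};
struct derived_it : base_it {}; // no overload for its exact type
struct opt_out_it : base_it {}; // has a more specific overload below

// Declarations only: no function bodies are required.
std::true_type is_passed_directly_in_onedpl_device_policies(base_it);
std::false_type is_passed_directly_in_onedpl_device_policies(opt_out_it);
} // namespace usr

static_assert(sketch::is_passed_directly_v<usr::base_it>);
static_assert(sketch::is_passed_directly_v<usr::derived_it>);  // matches the base_it overload
static_assert(!sketch::is_passed_directly_v<usr::opt_out_it>); // exact-type overload wins
static_assert(!sketch::is_passed_directly_v<int>);             // no ADL candidate, fallback used
```

Because the checks are all ``static_assert``, the sketch links without any definition of the overloads, which is what allows them to be forward declarations.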

The default implementation of ``is_passed_directly_in_onedpl_device_policies`` marks the following iterators as
"passed directly":

* Pointers (to handle USM pointers)
* Iterators with the ``using is_passed_directly = std::true_type`` trait
* Iterators to USM shared allocated ``std::vector``-s when the allocator type is knowable from the iterator type
Contributor

Is that ever the case?

Contributor Author
@danhoeflinger Apr 8, 2025

In practice, with a good amount of certainty with some compilers, yes. However, it's a good question whether it's worth mentioning in the specification, since it is not always knowable (and not 100% knowable, I believe).
uxlfoundation/oneDPL#1438 (comment)
uxlfoundation/oneDPL#1438 (comment)

How it looks in the oneDPL's implementation currently:
https://github.com/uxlfoundation/oneDPL/blob/1625f6a2dcc981a537c65fdc3b40951cb63b7326/include/oneapi/dpl/pstl/hetero/dpcpp/sycl_iterator.h#L154-L179

Contributor

Think of this as a specification, not as documentation.
If we want to require that std::vector with USM allocators is used without excessive copying, then it is part of the spec. Otherwise, it can be mentioned as implementation-specific or omitted completely.

Contributor Author

In practice, oneDPL's implementation is relying upon standard library implementation details which are not part of the C++ standard library specification of std::vector::iterator to enable handling of usm allocator vector iterators when it can confidently guess that the allocator type is knowable from the iterator type. From purely the C++ standard specification, I don't think it is possible to definitively know anything about the allocator from the type of a std::vector::iterator.

In practice, it would be quite unlikely that a standard library implementation could / would have a std::vector::iterator type which "tricks" the current oneDPL implementation's detection of a knowable "USM shared allocator", but I do think you could technically contrive such an implementation if you really wanted to.

From a specification perspective, it doesn't belong, I think.

Contributor Author

I've removed the line for now, but I'm open to discussion of course.

Contributor
@rarutyun Apr 9, 2025

Yeah, if we speak about Allocator, we should write the iterator type slightly differently: std::vector<T, Allocator>::iterator.

As you can see, Allocator is part of the specialized vector type, not of the iterator type. I don't know of a way in C++ to deduce an outer type from an inner type; in this concrete example, to deduce the vector<T, Allocator> type from the nested iterator type.
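
The deduction problem described in this comment can be reproduced in standard C++. A minimal sketch, in which the helper names `allocator_of` and `can_deduce_allocator` are hypothetical and exist only for this illustration: a nested type such as `std::vector<T, Alloc>::iterator` is a non-deduced context, so the template arguments can only be supplied explicitly.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: tries to recover the allocator from the nested iterator type.
// T and Alloc appear only to the left of "::", which is a non-deduced context.
template <typename T, typename Alloc>
Alloc allocator_of(typename std::vector<T, Alloc>::iterator);

// Detects whether allocator_of(it) compiles when T and Alloc must be deduced.
template <typename It, typename = void>
struct can_deduce_allocator : std::false_type {};

template <typename It>
struct can_deduce_allocator<It, std::void_t<decltype(allocator_of(std::declval<It>()))>> : std::true_type {};

using vec_it = std::vector<int>::iterator;

// Deduction from the iterator type alone fails.
static_assert(!can_deduce_allocator<vec_it>::value);

// The call is only valid when the caller already knows T and Alloc.
static_assert(std::is_same_v<decltype(allocator_of<int, std::allocator<int>>(std::declval<vec_it>())),
                             std::allocator<int>>);
```

This only shows what the language guarantees. An implementation may still inspect library-specific iterator internals, as discussed earlier in this thread, but that relies on details outside the C++ standard.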

Contributor Author

If we re-introduce something like this, we can describe it with std::vector<T,Allocator>::iterator, but I think it's not a requirement that makes a lot of sense in the specification, because it can't be achieved with 100% certainty for a generic implementation of the standard library.

* ``std::reverse_iterator<IteratorT>`` when ``IteratorT`` is "passed directly"
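
The default rules listed above (pointers, the ``is_passed_directly`` member alias, and ``std::reverse_iterator``) could be detected roughly as follows. This is a minimal sketch using partial specialization; the trait name ``sketch::passed_directly`` and the ``usr`` type are assumptions made for illustration, and oneDPL's real default is expressed through the ADL customization point rather than this exact mechanism.

```cpp
#include <iterator>
#include <list>
#include <type_traits>

// Hypothetical stand-in for the default rules; NOT oneDPL's actual implementation.
namespace sketch
{
// Primary template: not "passed directly" unless one of the rules below matches.
template <typename It, typename = void>
struct passed_directly : std::false_type {};

// Rule: pointers (to handle USM pointers).
template <typename T>
struct passed_directly<T*, void> : std::true_type {};

// Rule: iterators exposing a member alias named is_passed_directly; its value is used.
template <typename It>
struct passed_directly<It, std::void_t<typename It::is_passed_directly>> : It::is_passed_directly {};

// Rule: std::reverse_iterator<It> follows the wrapped iterator type.
template <typename It>
struct passed_directly<std::reverse_iterator<It>, void> : passed_directly<It> {};
} // namespace sketch

namespace usr
{
// Hypothetical user iterator opting in through the member alias.
struct tagged_it
{
    using is_passed_directly = std::true_type;
};
} // namespace usr

using list_it = std::list<int>::iterator; // ordinary host iterator, no opt-in

static_assert(sketch::passed_directly<int*>::value);
static_assert(sketch::passed_directly<usr::tagged_it>::value);
static_assert(!sketch::passed_directly<list_it>::value);
static_assert(sketch::passed_directly<std::reverse_iterator<int*>>::value);
static_assert(!sketch::passed_directly<std::reverse_iterator<list_it>>::value);
```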

oneDPL defines the "passed directly" behavior for its custom iterators as follows:

* ``counting_iterator`` and ``discard_iterator``: Always "passed directly".
* ``permutation_iterator``: "Passed directly" if both its source iterator and its index map are "passed directly".
* ``transform_iterator``: "Passed directly" if its source iterator is "passed directly".
* ``zip_iterator``: "Passed directly" if all base iterators are "passed directly".
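
The way these wrapper rules compose can be sketched with hidden-friend declarations. The types ``zip_like`` and ``transform_like`` and the trait ``sketch::is_passed_directly`` are hypothetical stand-ins written for this illustration; they are not oneDPL's ``zip_iterator`` or ``transform_iterator``, and only the propagation logic is modeled.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins; NOT oneDPL's actual iterators or trait.
namespace sketch
{
// Fallbacks found by ordinary lookup: pointers are accepted, everything else is rejected.
template <typename T>
std::true_type is_passed_directly_in_onedpl_device_policies(T*);
std::false_type is_passed_directly_in_onedpl_device_policies(...);

// Trait that inherits from whatever type the selected overload returns.
template <typename T>
struct is_passed_directly : decltype(is_passed_directly_in_onedpl_device_policies(std::declval<T>()))
{
};

// "Passed directly" only if every wrapped iterator is.
template <typename... Its>
struct zip_like
{
    std::tuple<Its...> its;
    friend auto is_passed_directly_in_onedpl_device_policies(zip_like)
        -> std::conjunction<is_passed_directly<Its>...>;
};

// "Passed directly" if the source iterator is; the functor type does not matter here.
template <typename Src, typename Fn>
struct transform_like
{
    Src src;
    Fn fn;
    friend auto is_passed_directly_in_onedpl_device_policies(transform_like) -> is_passed_directly<Src>;
};
} // namespace sketch

namespace usr
{
struct host_only_it {}; // no overload, so the ellipsis fallback applies
struct twice
{
    int operator()(int x) const { return 2 * x; }
};
} // namespace usr

static_assert(sketch::is_passed_directly<sketch::zip_like<int*, float*>>::value);
static_assert(!sketch::is_passed_directly<sketch::zip_like<int*, usr::host_only_it>>::value);
static_assert(sketch::is_passed_directly<sketch::transform_like<int*, usr::twice>>::value);
static_assert(!sketch::is_passed_directly<sketch::transform_like<usr::host_only_it, usr::twice>>::value);
// Nesting composes: a zip over a transform over a pointer is still "passed directly".
static_assert(sketch::is_passed_directly<sketch::zip_like<sketch::transform_like<int*, usr::twice>, int*>>::value);
```

A ``permutation_iterator``-style wrapper would follow the same pattern as ``zip_like`` with exactly two components, the source iterator and the index map.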


Public Trait: ``is_passed_directly_to_device[_v]``
++++++++++++++++++++++++++++++++++++++++++++++++++

The public trait ``oneapi::dpl::is_passed_directly_to_device[_v]`` can be used to query whether an iterator type is
"passed directly" to SYCL kernels. The trait is defined in ``<oneapi/dpl/iterator>``.

``oneapi::dpl::is_passed_directly_to_device<T>`` evaluates to a type with the characteristics of ``std::true_type`` if
``T`` can be "passed directly" to SYCL kernels, otherwise it evaluates to a type with the characteristics of
``std::false_type``.

``oneapi::dpl::is_passed_directly_to_device_v<T>`` is a ``constexpr bool`` that evaluates to ``true`` if ``T`` can be
"passed directly" to SYCL kernels, otherwise it evaluates to ``false``.

Example
+++++++

.. code:: cpp

    namespace usr
    {
        struct pass_dir_it
        {
            /* unspecified user definition of a "passed directly" iterator */
        };

        std::true_type
        is_passed_directly_in_onedpl_device_policies(pass_dir_it);

        struct no_pass_dir_it
        {
            /* unspecified user definition of a non "passed directly" iterator */
        };

        std::false_type
        is_passed_directly_in_onedpl_device_policies(no_pass_dir_it);
    }

    static_assert(oneapi::dpl::is_passed_directly_to_device_v<usr::pass_dir_it> == true);
    static_assert(oneapi::dpl::is_passed_directly_to_device_v<usr::no_pass_dir_it> == false);

    // Example with base iterators and ADL overload as a hidden friend
    template <typename It1, typename It2>
    struct it_pair
    {
        It1 first;
        It2 second;
        friend auto is_passed_directly_in_onedpl_device_policies(it_pair) ->
            std::conjunction<oneapi::dpl::is_passed_directly_to_device<It1>,
                             oneapi::dpl::is_passed_directly_to_device<It2>>;
    };

    static_assert(oneapi::dpl::is_passed_directly_to_device_v<it_pair<usr::pass_dir_it, usr::pass_dir_it>> == true);
    static_assert(oneapi::dpl::is_passed_directly_to_device_v<it_pair<usr::pass_dir_it, usr::no_pass_dir_it>> == false);


.. _`C++ Standard`: https://isocpp.org/std/the-standard
.. _`SYCL`: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html