Skip to content

Add env based DeviceFor::* algorithms#7798

Merged
gonidelis merged 11 commits intoNVIDIA:mainfrom
gonidelis:for_env
Mar 9, 2026
Merged

Add env based DeviceFor::* algorithms#7798
gonidelis merged 11 commits intoNVIDIA:mainfrom
gonidelis:for_env

Conversation

@gonidelis
Copy link
Member

@gonidelis gonidelis commented Feb 26, 2026

Adds env based DeviceFor::*

closed #7541

There are not deterministic guarantees imposed for the DeviceFor::* algorithms.

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Feb 26, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-project-automation github-project-automation bot moved this to Todo in CCCL Feb 26, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Feb 26, 2026
@gonidelis gonidelis force-pushed the for_env branch 2 times, most recently from af9d8de to 59ab137 Compare February 26, 2026 02:13
@gonidelis gonidelis marked this pull request as ready for review February 26, 2026 20:39
@gonidelis gonidelis requested a review from a team as a code owner February 26, 2026 20:39
@gonidelis gonidelis requested a review from davebayer February 26, 2026 20:39
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Feb 26, 2026
@gonidelis
Copy link
Member Author

One thing that might kill backwards compatibility from unifying the env overloads with the old ones is that we tend to make new overloads [[nodiscard]]. Adding this to the "new" env overloads (which just substitute the old) might pop compile erros to users out of the blue. @bernhardmgruber @gevtushenko should I avoid doing the overloads in this pr nodiscard?

@gonidelis
Copy link
Member Author

todo: add docs about memory_resource (?)

@github-actions

This comment has been minimized.

@bernhardmgruber
Copy link
Contributor

One thing that might kill backwards compatibility from unifying the env overloads with the old ones is that we tend to make new overloads [[nodiscard]]. Adding this to the "new" env overloads (which just substitute the old) might pop compile erros to users out of the blue. @bernhardmgruber @gevtushenko should I avoid doing the overloads in this pr nodiscard?

Good point. Please omit the [[nodiscard]] in this case.

@github-actions

This comment has been minimized.

@gonidelis
Copy link
Member Author

Since we replaced the cudaStream_t stream parameter with EnvT env, we broke backwards compatibility and callers that relied on implicit conversion from nvbench::cuda_stream to cudaStream_t. The extents benchmark showcases that. I see three options:

a) We go back to only adding env overloads (not converting old ones) for DeviceFor
b) We check internally if passed 3rd argument (env) is nvbench::cuda_stream and do the conversion explicitly inside the body
c) We break, and ask for people that were relying on implicit conversion to explicitly .get_stream()

@bernhardmgruber
Copy link
Contributor

Since we replaced the cudaStream_t stream parameter with EnvT env, we broke backwards compatibility and callers that relied on implicit conversion from nvbench::cuda_stream to cudaStream_t. The extents benchmark showcases that.

That's interesting, because we hit the same case in cub::DeviceTransform with RMM's stream type. We solved this in #7266 and added tests in #7278. What we apparently missed are types that are convertible to cudaStream_t and are not copyable, because then you can't pass them to the environment parameter.

Let me ask @miscco

@bernhardmgruber
Copy link
Contributor

Let me ask @miscco

Started an internal Slack thread.

In the meantime, I came up with a fix and pushed a commit.

@github-actions

This comment has been minimized.

@bernhardmgruber
Copy link
Contributor

/ok to test 712a5f3

@github-actions

This comment has been minimized.

@bernhardmgruber
Copy link
Contributor

pre-commit.ci autofix

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Mar 9, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bernhardmgruber
Copy link
Contributor

/ok to test ceda482

@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2026

🥳 CI Workflow Results

🟩 Finished in 3h 49m: Pass: 100%/249 | Total: 7d 02h | Max: 3h 48m | Hits: 84%/155556

See results here.

@gonidelis gonidelis merged commit d0befe5 into NVIDIA:main Mar 9, 2026
266 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Mar 9, 2026
@@ -926,6 +1127,10 @@ public:
//! .. versionadded:: 2.4.0
//! First appears in CUDA Toolkit 12.5.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😡😡😡😡😡

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Add env-based API for cub::DeviceFor

3 participants