Skip to content

Add environment DeviceMerge #7969

Open
gonidelis wants to merge 2 commits intoNVIDIA:mainfrom
gonidelis:merge_env
Open

Add environment DeviceMerge #7969
gonidelis wants to merge 2 commits intoNVIDIA:mainfrom
gonidelis:merge_env

Conversation

@gonidelis
Copy link
Member

@gonidelis gonidelis commented Mar 10, 2026

fixes #7543

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Mar 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 10, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Mar 10, 2026
@gonidelis gonidelis changed the title Merge env Add environment DeviceMerge Mar 10, 2026
@gonidelis gonidelis marked this pull request as ready for review March 10, 2026 07:26
@gonidelis gonidelis requested a review from a team as a code owner March 10, 2026 07:26
@gonidelis gonidelis requested a review from elstehle March 10, 2026 07:26
@gonidelis gonidelis enabled auto-merge (squash) March 10, 2026 07:26
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 10, 2026
@github-actions
Copy link
Contributor

😬 CI Workflow Results

🟥 Finished in 2h 04m: Pass: 15%/249 | Total: 4d 12h | Max: 2h 03m | Hits: 88%/55533

See results here.

Comment on lines +162 to +164
::cuda::std::enable_if_t<!::cuda::std::is_same_v<KeyIteratorIn1, void*>
&& !::cuda::std::is_same_v<KeyIteratorIn1, ::cuda::std::nullptr_t>,
int> = 0>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remark: that's an interesting constraint since it guards against using the old overload with nullptr. It's fine. You need it here I guess because you cannot constraint a second template argument since it's just int64 here and a size_t from the other overload would convert to int64 well.

Comment on lines +178 to +183
using requested_determinism_t =
::cuda::std::execution::__query_result_or_t<requirements_t,
::cuda::execution::determinism::__get_determinism_t,
::cuda::execution::determinism::run_to_run_t>;
static_assert(!::cuda::std::is_same_v<requested_determinism_t, ::cuda::execution::determinism::gpu_to_gpu_t>,
"gpu_to_gpu determinism is not supported for unstable device merge");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important: I need to think about this, since an unstable algorithm could still be deterministic (even across multiple GPUs). How did you conclude that the merge path implementation is run_to_run deterministic?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Add env-based API for cub::DeviceMerge

2 participants