Skip to content

Add environment DeviceMerge #7969

Open
gonidelis wants to merge 4 commits intoNVIDIA:mainfrom
gonidelis:merge_env
Open

Add environment DeviceMerge #7969
gonidelis wants to merge 4 commits intoNVIDIA:mainfrom
gonidelis:merge_env

Conversation

@gonidelis
Copy link
Member

@gonidelis gonidelis commented Mar 10, 2026

fixes #7543

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Mar 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 10, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Mar 10, 2026
@gonidelis gonidelis changed the title Merge env Add environment DeviceMerge Mar 10, 2026
@gonidelis gonidelis marked this pull request as ready for review March 10, 2026 07:26
@gonidelis gonidelis requested a review from a team as a code owner March 10, 2026 07:26
@gonidelis gonidelis requested a review from elstehle March 10, 2026 07:26
@gonidelis gonidelis enabled auto-merge (squash) March 10, 2026 07:26
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 10, 2026
@github-actions

This comment has been minimized.

Comment on lines +162 to +164
::cuda::std::enable_if_t<!::cuda::std::is_same_v<KeyIteratorIn1, void*>
&& !::cuda::std::is_same_v<KeyIteratorIn1, ::cuda::std::nullptr_t>,
int> = 0>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remark: that's an interesting constraint since it guards against using the old overload with nullptr. It's fine. You need it here I guess because you cannot constraint a second template argument since it's just int64 here and a size_t from the other overload would convert to int64 well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still get an ambiguity:

cub::DeviceMerge::MergeKeys(
    thrust::raw_pointer_cast(temp_storage.data()),  // KeyIteratorIn1 = char*
    temp_storage_bytes,                              // num_keys1 (size_t -> int64)
    keys1.begin(),                                   // KeyIteratorIn2
    static_cast<int>(keys1.size()),                  // num_keys2 (int -> int64)
    keys2.begin(),                                   // KeyIteratorOut
    static_cast<int>(keys2.size()),                  // CompareOp = int
    result.begin());                                 // EnvT = iterator

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need extra constraints

@gonidelis
Copy link
Member Author

After long discussion with @bernhardmgruber we decided not to specify any deterministic guarantees for DeviceMerge


// %PARAM% TEST_LAUNCH lid 0:1:2

#include <cuda/__execution/require.h>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might not be needed

@github-actions
Copy link
Contributor

😬 CI Workflow Results

🟥 Finished in 2h 20m: Pass: 99%/249 | Total: 8d 17h | Max: 2h 19m | Hits: 73%/157066

See results here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Add env-based API for cub::DeviceMerge

2 participants