Added GPU interface for triangle counting algorithm by olegkkruglov · Pull Request #3620 · uxlfoundation/oneDAL

olegkkruglov · 2026-05-06T13:32:32Z

Description

Contains the content of #3482 and has the same CI problems.
Triangle counting GPU algorithm

GPU kernel implementation using the ordered-count method on a CSR topology held in device USM
GPU kernel dispatch infrastructure (select_kernel_dpc, vertex_ranking_ops dispatcher for device_csr_topology)
GPU test suite with 8 test cases validating local and global triangle counts
Bazel BUILD rules for GPU kernel compilation and test target
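The ordered-count idea can be illustrated on the host. The following is a minimal sketch, not the GPU kernel from this PR: it assumes that "ordered count" means counting each triangle once by visiting only neighbors with smaller vertex ids, and that adjacency lists are sorted. The names `csr_graph`, `triangle_counts`, and `count_triangles_ordered` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side CSR container; names are illustrative, not oneDAL's.
struct csr_graph {
    std::vector<std::int64_t> row_offsets; // size = vertex_count + 1
    std::vector<std::int32_t> col_indices; // neighbors per vertex, sorted ascending
};

struct triangle_counts {
    std::vector<std::int64_t> local; // triangles incident to each vertex
    std::int64_t global = 0;         // number of distinct triangles
};

// Counts every triangle {w < v < u} exactly once by intersecting the
// "lower" parts of the sorted adjacency lists of u and v.
triangle_counts count_triangles_ordered(const csr_graph& g) {
    assert(!g.row_offsets.empty());
    const std::int64_t n = static_cast<std::int64_t>(g.row_offsets.size()) - 1;
    triangle_counts result;
    result.local.assign(n, 0);

    for (std::int64_t u = 0; u < n; ++u) {
        const std::int64_t u_begin = g.row_offsets[u];
        const std::int64_t u_end = g.row_offsets[u + 1];
        for (std::int64_t e = u_begin; e < u_end; ++e) {
            const std::int32_t v = g.col_indices[e];
            if (v >= u) break; // sorted list: the remaining neighbors are not below u

            std::int64_t i = u_begin;
            std::int64_t j = g.row_offsets[v];
            const std::int64_t v_end = g.row_offsets[v + 1];
            while (i < u_end && j < v_end) {
                const std::int32_t a = g.col_indices[i];
                const std::int32_t b = g.col_indices[j];
                if (a >= v || b >= v) break; // only common neighbors w < v qualify
                if (a < b) {
                    ++i;
                } else if (b < a) {
                    ++j;
                } else {
                    // a == b is a common neighbor w, so {w, v, u} is a triangle.
                    ++result.local[u];
                    ++result.local[v];
                    ++result.local[a];
                    ++result.global;
                    ++i;
                    ++j;
                }
            }
        }
    }
    return result;
}
```

On a single triangle the sketch reports a global count of 1 and a local count of 1 per vertex. A GPU version would typically assign vertices or edges to work-items and accumulate the counts atomically; that part is not shown here.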

Device-aware graph type

Extended csr_topology with a to_device method and a queue-aware set_topology for device pointers
Extended undirected_adjacency_vector_graph_impl with a to_device method that transfers topology, vertex values, and edge values, using sycl::get_pointer_type to skip redundant transfers
Added public to_device(queue) on undirected_adjacency_vector_graph

device_csr_topology utility

Standalone device-resident CSR topology type wrapping dal::array objects in device USM
Helper functions topology_to_device and topology_to_host for explicit transfers
Alternative input path for graph algorithms accepting device data directly
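For readers unfamiliar with the layout such a type wraps, here is a minimal host-only sketch of building a CSR topology from an undirected edge list. It is an illustration under assumptions: `std::vector` stands in for `dal::array` in device USM, no device transfer is performed, the input has no self-loops or duplicate edges, and `csr_topology_sketch` and `build_csr` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CSR topology; not the oneDAL type.
struct csr_topology_sketch {
    std::int64_t vertex_count = 0;
    std::vector<std::int64_t> row_offsets; // size = vertex_count + 1
    std::vector<std::int32_t> col_indices; // size = 2 * edge_count
};

using edge_list = std::vector<std::pair<std::int32_t, std::int32_t>>;

csr_topology_sketch build_csr(std::int64_t vertex_count, const edge_list& edges) {
    assert(vertex_count >= 0);
    csr_topology_sketch t;
    t.vertex_count = vertex_count;
    t.row_offsets.assign(static_cast<std::size_t>(vertex_count) + 1, 0);

    // Pass 1: count degrees; each undirected edge contributes to both endpoints.
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < vertex_count);
        assert(e.second >= 0 && e.second < vertex_count);
        ++t.row_offsets[e.first + 1];
        ++t.row_offsets[e.second + 1];
    }

    // Prefix sum turns degrees into row offsets.
    for (std::int64_t v = 0; v < vertex_count; ++v) {
        t.row_offsets[v + 1] += t.row_offsets[v];
    }

    // Pass 2: scatter neighbors into their rows.
    std::vector<std::int64_t> cursor(t.row_offsets.begin(), t.row_offsets.end() - 1);
    t.col_indices.resize(2 * edges.size());
    for (const auto& e : edges) {
        t.col_indices[cursor[e.first]++] = e.second;
        t.col_indices[cursor[e.second]++] = e.first;
    }

    // Sort each row so that merge-style intersections work on it.
    for (std::int64_t v = 0; v < vertex_count; ++v) {
        std::sort(t.col_indices.begin() + t.row_offsets[v],
                  t.col_indices.begin() + t.row_offsets[v + 1]);
    }
    return t;
}
```

A device-resident variant would hold the same two arrays in USM and copy them with the queue; the helper names for that step in this PR are topology_to_device and topology_to_host, whose signatures are not reproduced here.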

vertex_ranking dispatch for DPC++

Added vertex_ranking overload accepting device_csr_topology
Extended is_valid_graph trait (guarded by ONEDAL_DATA_PARALLEL)
select_kernel GPU specialization for data_parallel_policy
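The dispatch pattern described above can be sketched with plain template specialization. This is a simplified illustration, not the oneDAL implementation: every type below is a hypothetical stand-in (only the names echo the PR), the kernels return a string instead of computing anything, and `ONEDAL_DATA_PARALLEL` is defined locally so the guarded specialization is visible without a DPC++ build.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Defined here only for the sketch; a DPC++ build is assumed to set it.
#define ONEDAL_DATA_PARALLEL

// Hypothetical stand-ins for policies and graph inputs.
struct host_policy {};
struct data_parallel_policy {};
struct host_graph {};
struct device_csr_topology_sketch {};

// Trait listing the input types accepted as graphs.
template <typename T>
struct is_valid_graph : std::false_type {};
template <>
struct is_valid_graph<host_graph> : std::true_type {};
#ifdef ONEDAL_DATA_PARALLEL
// The device topology is accepted only when data-parallel support is compiled in.
template <>
struct is_valid_graph<device_csr_topology_sketch> : std::true_type {};
#endif

// Kernel selection keyed on the policy type.
template <typename Policy>
struct select_kernel;
template <>
struct select_kernel<host_policy> {
    static std::string name() { return "cpu_kernel"; }
};
template <>
struct select_kernel<data_parallel_policy> {
    static std::string name() { return "gpu_kernel"; }
};

// Entry point: rejects unsupported inputs at compile time, then dispatches.
template <typename Policy, typename Graph>
std::string vertex_ranking_sketch(const Policy&, const Graph&) {
    static_assert(is_valid_graph<Graph>::value, "unsupported graph type");
    const std::string kernel = select_kernel<Policy>::name();
    assert(!kernel.empty());
    return kernel;
}
```

Calling `vertex_ranking_sketch(data_parallel_policy{}, device_csr_topology_sketch{})` selects the GPU branch, while removing the macro definition makes the same call fail to compile, which is the effect a guarded trait extension has.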

DPC++ example

Triangle counting batch example: reads a graph from CSV, transfers it to the device via graph.to_device(q), and runs the GPU kernel
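The first step of that flow, reading an edge list from text, can be sketched without oneDAL. The snippet below is an assumption-laden illustration: the exact file layout the example expects is not stated here, so the parser accepts one edge per line with the two vertex ids separated by a comma or whitespace, and `read_edge_list_csv` is a hypothetical helper rather than the reader used in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using edge_list = std::vector<std::pair<std::int32_t, std::int32_t>>;

// Parses one edge per line ("u,v" or "u v"); lines that do not start
// with two integers, including blank lines, are skipped.
edge_list read_edge_list_csv(std::istream& in) {
    edge_list edges;
    std::string line;
    while (std::getline(in, line)) {
        // Treat commas as whitespace so both separators are handled alike.
        for (char& c : line) {
            if (c == ',') c = ' ';
        }
        std::istringstream fields(line);
        std::int32_t u = 0;
        std::int32_t v = 0;
        if (fields >> u >> v) {
            assert(u >= 0 && v >= 0); // vertex ids are assumed non-negative
            edges.emplace_back(u, v);
        }
    }
    return edges;
}
```

The resulting edge list would then be turned into a graph object, moved to the device, and passed to the algorithm; those steps depend on the oneDAL API and are not shown.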

Checklist:

Completeness and readability

I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.
I have extended testing suite if new functionality was introduced in this PR.

Performance

I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

- Removed the existing frontier_dpc.hpp file to streamline the codebase.
- Introduced new test files for advance operation, BFS, and basic frontier operations.
- Implemented comprehensive tests to validate the functionality of the frontier data structure.
- Enhanced the frontier class with additional methods for better performance and usability.
- Ensured compatibility with SYCL and improved device memory management.

olegkkruglov added the "new algorithm" label (New algorithm or method in oneDAL) on May 6, 2026

olegkkruglov force-pushed the gpu-tc branch from 65b4ce2 to d00a403 on May 11, 2026 14:50

antonio-decaro added 28 commits May 12, 2026 04:04

feat: add bitset and frontier classes with atomic operations and tests

776c9df

feat: enhance frontier class with additional operations and tests; remove obsolete test header

cb957e8

feat: enhance frontier class with additional methods and tests for check and clear operations

10d8a3d

insert compunte active kernel

0515247

add compute active frontier operation

c4149fe

bug fix

7a12f2c

refactor: rename clear and gas methods to unset and atomic_unset for consistency; update related functionality; fixed bug for SIGSEGV

74459af

feat: add advance operation tests for frontier class; include device name printing and frontier checks

b141128

fixed advance operator

40eea0b

implemented async advance

80835a6

add documentation for frontier

944d754

add performance tests for bfs

27a71b0

Refactor advance function to remove expected_size parameter and optimize global size calculation

9dcba2e

moved bitset to frontier folder

fcc4f75

applied clang-format

1e40aa9

added trailing new line to source files to complie with the CI formatter.

d471d0a

update headers to comply with the oneDAL naming style

f720a87

primitives/frontier: remove unused dependencies from BUILD

10813a6

primitives/frontier: small code cleanup in advance.hpp (comment removal and add explanatory comment)

4da1b4c

primitives/frontier: remove unused includes and stray blank lines in headers, sources and tests

923f2fa

Remove perf_tests target from frontier BUILD

7af91f5

Refactor frontier context and bitmap kernel: rename Context/ContextState and introduce hierarchical reductions

6356198

Remove performance testing scaffolding from frontier BFS test and rename test file

2efbe84

Enhance code readability by adding comments and using const for variable declarations in BitmapKernel and frontier_dpc implementations

2b7b2c4

Add copyright notices to header files in the frontier module

5dda7de

Refactor frontier test files: change vector to set for frontier representation and update test case names for clarity

f2641e6

Refactor compute_next_frontier to use vector<bool> instead of set<uint32_t> for improved performance and memory efficiency

06265d4

antonio-decaro and others added 22 commits May 12, 2026 04:04

tests: adapt frontier and BFS tests to use std::uint64_t for counts and loops

12c4997

frontier: add element count to bitset constructor and store size

a85aa7e

frontier: propagate sizes and add docs and swap helper

838e25a

frontier: remove unnecessary bitset include from graph header

5a7ee62

frontier/advance: rename kernel, tighten types, and polish docs/formatting

5088c49

refactor: improve bitset layer size calculation for frontier initialization

daf56a1

clang-format fix

53b9f79

Change license yaml to be able to add possible additional copyright

e7feaec

fix?

dc4f81d

fix?

862feda

fix?

8e758c9

fix?

a7806e2

fix?

b40d6db

fix?

d0de4ec

fix?

ca9b27e

fix?

f6c80b0

fix?

6797388

test

649f16f

fix

3859a98

Make license checker strict

22378f3

Add preview namespace

a654178

check

8947a09

olegkkruglov force-pushed the gpu-tc branch from d00a403 to 4fa47f2 on May 12, 2026 11:58

check

7b6c449

olegkkruglov force-pushed the gpu-tc branch from 4fa47f2 to 95fe2ce on May 12, 2026 18:11

Kruglov, Oleg added 3 commits May 12, 2026 11:11

Fix dbg

db05f45

tc gpu

87c706a

CI fix

95fe2ce

This was referenced May 29, 2026

Added GPU interface for connected components algorithm #3621

Draft

Added GPU interfaces for shortest paths algorithm #3622

Draft

olegkkruglov wants to merge 63 commits into uxlfoundation:main from olegkkruglov:gpu-tc


2 participants
