Added GPU interface for triangle counting algorithm #3620

Draft
olegkkruglov wants to merge 63 commits into uxlfoundation:main from olegkkruglov:gpu-tc

Contributor

@olegkkruglov olegkkruglov commented May 6, 2026

Description

Contains the content of #3482 and has the same CI problems.
Triangle counting GPU algorithm

  • GPU kernel implementation using ordered-count method on CSR topology in device USM memory
  • GPU kernel dispatch infrastructure (select_kernel_dpc, vertex_ranking_ops dispatcher for device_csr_topology)
  • GPU test suite with 8 test cases validating local and global triangle counts
  • Bazel BUILD rules for GPU kernel compilation and test target
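The GPU kernel itself is not shown in this description. As a rough illustration of what an ordered-count approach on CSR data can look like, here is a minimal host-side C++ sketch. It assumes sorted adjacency lists, no self-loops, and no duplicate edges, and it counts each triangle (u, v, w) exactly once by requiring u < v < w. The names `host_csr`, `triangle_counts`, and `count_triangles_ordered` are hypothetical and are not part of oneDAL; the actual kernel in this PR may order or partition the work differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical host-side CSR container; names are illustrative, not oneDAL's.
struct host_csr {
    std::vector<std::int64_t> row_offsets; // size = vertex_count + 1
    std::vector<std::int32_t> col_indices; // sorted neighbor ids per vertex
};

struct triangle_counts {
    std::int64_t global = 0;         // total number of triangles
    std::vector<std::int64_t> local; // triangles incident to each vertex
};

// Assumption: adjacency lists are sorted and the graph is simple and undirected.
triangle_counts count_triangles_ordered(const host_csr& g) {
    triangle_counts result;
    if (g.row_offsets.empty()) {
        return result;
    }
    const std::int32_t n = static_cast<std::int32_t>(g.row_offsets.size() - 1);
    result.local.assign(static_cast<std::size_t>(n), 0);

    for (std::int32_t u = 0; u < n; ++u) {
        for (std::int64_t e = g.row_offsets[u]; e < g.row_offsets[u + 1]; ++e) {
            const std::int32_t v = g.col_indices[e];
            if (v <= u) {
                continue; // visit each undirected edge once, with u < v
            }
            // Merge-intersect adj(u) and adj(v); keep only common neighbors w > v.
            std::int64_t i = g.row_offsets[u];
            std::int64_t j = g.row_offsets[v];
            const std::int64_t i_end = g.row_offsets[u + 1];
            const std::int64_t j_end = g.row_offsets[v + 1];
            while (i < i_end && j < j_end) {
                const std::int32_t a = g.col_indices[i];
                const std::int32_t b = g.col_indices[j];
                if (a < b) {
                    ++i;
                } else if (b < a) {
                    ++j;
                } else {
                    if (a > v) { // ordering u < v < w: the triangle is counted once
                        ++result.global;
                        ++result.local[u];
                        ++result.local[v];
                        ++result.local[a];
                    }
                    ++i;
                    ++j;
                }
            }
        }
    }
    return result;
}
```

In this sketch the global count equals one third of the sum of the local counts, which is a convenient sanity check when comparing local and global results in tests.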

Device-aware graph type

  • Extended csr_topology with to_device method and queue-aware set_topology for device pointers
  • Extended undirected_adjacency_vector_graph_impl with to_device that transfers topology, vertex values, and edge values using sycl::get_pointer_type to skip redundant transfers
  • Added public to_device(queue) on undirected_adjacency_vector_graph

device_csr_topology utility

  • Standalone device-resident CSR topology type wrapping dal::array objects in device USM
  • Helper functions topology_to_device and topology_to_host for explicit transfers
  • Alternative input path for graph algorithms accepting device data directly
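For readers less familiar with the layout that a CSR topology carries, the following host-only C++ sketch builds the two CSR arrays (row offsets and sorted column indices) from an undirected edge list. It is an illustration under stated assumptions: `host_csr` and `build_csr` are hypothetical names, the input has no self-loops or duplicate edges, and all vertex ids lie in `[0, vertex_count)`. The actual `device_csr_topology` wraps `dal::array` objects in device USM and is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical host-side CSR container; names are illustrative, not oneDAL's.
struct host_csr {
    std::vector<std::int64_t> row_offsets; // size = vertex_count + 1
    std::vector<std::int32_t> col_indices; // sorted neighbor ids per vertex
};

// Assumption: edges are undirected, unique, and contain no self-loops.
host_csr build_csr(std::int32_t vertex_count,
                   const std::vector<std::pair<std::int32_t, std::int32_t>>& edges) {
    host_csr g;
    g.row_offsets.assign(static_cast<std::size_t>(vertex_count) + 1, 0);

    // Count degrees: every undirected edge contributes to both endpoints.
    for (const auto& e : edges) {
        ++g.row_offsets[e.first + 1];
        ++g.row_offsets[e.second + 1];
    }
    // Prefix sum turns per-vertex degrees into row offsets.
    for (std::int32_t i = 0; i < vertex_count; ++i) {
        g.row_offsets[i + 1] += g.row_offsets[i];
    }

    g.col_indices.resize(static_cast<std::size_t>(g.row_offsets[vertex_count]));
    // cursor[v] is the next free slot in the neighbor list of v.
    std::vector<std::int64_t> cursor(g.row_offsets.begin(), g.row_offsets.end() - 1);
    for (const auto& e : edges) {
        g.col_indices[cursor[e.first]++] = e.second;
        g.col_indices[cursor[e.second]++] = e.first;
    }
    // Sort each neighbor list so that intersections can use a linear merge.
    for (std::int32_t v = 0; v < vertex_count; ++v) {
        std::sort(g.col_indices.begin() + g.row_offsets[v],
                  g.col_indices.begin() + g.row_offsets[v + 1]);
    }
    return g;
}
```

Keeping the topology as two flat arrays is what makes an explicit host-to-device transfer straightforward: each array can be copied as one contiguous block.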

vertex_ranking dispatch for DPC++

  • Added vertex_ranking overload accepting device_csr_topology
  • Extended is_valid_graph trait (guarded by ONEDAL_DATA_PARALLEL)
  • select_kernel GPU specialization for data_parallel_policy

DPC++ example

  • Triangle counting batch example: reads CSV graph, transfers to device via graph.to_device(q), runs GPU kernel

Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
  • I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@olegkkruglov added the "new algorithm" label (New algorithm or method in oneDAL) on May 6, 2026
…consistency; update related functionality; fixed bug for SIGSEGV
- Removed the existing frontier_dpc.hpp file to streamline the codebase.
- Introduced new test files for advance operation, BFS, and basic frontier operations.
- Implemented comprehensive tests to validate the functionality of the frontier data structure.
- Enhanced the frontier class with additional methods for better performance and usability.
- Ensured compatibility with SYCL and improved device memory management.
…ble declarations in BitmapKernel and frontier_dpc implementations
…sentation and update test case names for clarity
…t32_t> for improved performance and memory efficiency
Labels

new algorithm: New algorithm or method in oneDAL


2 participants