Skip to content

Commit 0a17b53

Browse files
committed
Introduced device::update_tip_length_buffer.
While modifying cuda::kernels::update_tip_length_buffer to make use of that common function, and updating device::build_tracks to be able to collaborate with the slightly different data that update_tip_length_buffer is now producing.
1 parent b93f454 commit 0a17b53

File tree

10 files changed

+205
-76
lines changed

10 files changed

+205
-76
lines changed

.github/copilot-instructions.md

Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,80 @@
1+
This project is a heterogeneous demonstrator for tracking algorithms (CPU, CUDA,
2+
SYCL, Alpaka, Kokkos, Futhark).
3+
4+
Quick context for an AI coding agent
5+
- Big picture: `core/` contains the event-data-model, code shared by all
6+
architectures, and the host algorithms. Device-specific implementations
7+
live under `device/` in subfolders `cuda/`, `sycl/`, `alpaka/`, etc.
8+
Examples and executables are built under `examples/` and top-level CMake
9+
presets produce binaries in the build's `bin/` directory.
10+
- Primary build system: CMake (C++20). See `CMakeLists.txt`, `CMakePresets.json`
11+
and `cmake/` for presets and helper macros. Important per-module
12+
CMake: `core/CMakeLists.txt`, `device/*/CMakeLists.txt`, `extern/` for
13+
external dependencies.
14+
15+
What to change and where (concrete patterns)
16+
- New device specific algorithms are to be put into `device/<backend>/src/` with
17+
headers in `device/<backend>/include/traccc/<backend>/`. Follow existing
18+
patterns for kernel launches and host-device separation. See
19+
`device/cuda/src/clusterization/` for examples.
20+
21+
Build & test workflows (exact commands)
22+
- Common build (from repo root):
23+
cmake --preset <preset> -S traccc/ -B build
24+
cmake --build build/
25+
Common presets: `host-fp32`, `cuda-fp32`, `sycl-fp32`, `alpaka-fp32` (see
26+
`README.md`, build presets and `CMakePresets.json`).
27+
- Run unit tests (if built): use CTest from the build directory:
28+
`ctest --output-on-failure -C <config>`. The project uses GoogleTest;
29+
tests are controlled by `TRACCC_BUILD_TESTING`.
30+
31+
Project-specific conventions and gotchas
32+
- C++ standard: project requires C++20 for host and device code. Many CMake
33+
checks will fail early if not satisfied. See `CMakeLists.txt` for explicit
34+
checks.
35+
- Templated inline code: `.ipp` files contain template implementations. Keep
36+
header/impl separation consistent to avoid ODR/link errors.
37+
- VecMem/Eigen/Thrust/Acts and other dependencies are vendored under `extern/`
38+
but can be toggled with `TRACCC_USE_SYSTEM_*` flags. Prefer following the repo
39+
convention: use vendored versions unless the developer wants to link
40+
system/Spack libs.
41+
42+
Where to look for architecture decisions and data flow
43+
- High-level diagram and narratives: `README.md` (top-level) includes a mermaid
44+
dataflow graph linking clusterization -> measurements -> spacepoints -> seeding -> finding -> fitting.
45+
- Core algorithm grouping and examples: `core/CMakeLists.txt` lists the major
46+
components and files that implement clusterization, seeding, finding, and
47+
fitting — good entry points when adding or modifying an algorithm.
48+
- Device backends: `device/` mirrors core algorithms for each backend. Use these
49+
folders to understand how host/device separation is organized.
50+
51+
Testing, benchmarking and performance
52+
- Examples for running simulated data are in the README
53+
(e.g. `traccc_simulate_telescope` and `traccc_throughput_mt`).
54+
- For profiling builds, the repo includes support for collecting build profiling
55+
logs when CTest launchers are enabled (see `CMakeLists.txt` section that
56+
configures `traccc-ctest.sh`).
57+
58+
Editing guidance for AI agents (practical rules)
59+
- Preserve header/implementation split; add `.ipp` files for template
60+
implementations. Match namespaces: `traccc::...` and directory
61+
structure -> namespace mapping.
62+
- Add compile-time feature guards using the module's CMake options when touching
63+
device/backends (e.g. `TRACCC_BUILD_CUDA`). Update `CMakeLists.txt` entries if
64+
new files are added to a target.
65+
- When adding new public APIs, update `core/CMakeLists.txt` or relevant module
66+
CMake to include headers and sources in the `traccc_add_library`/target
67+
definitions.
68+
- Keep changes minimal and provide a small test or example in `examples/` or add
69+
a unit test under `tests/` to exercise the new code.
70+
- Add copyright notices to new files following existing patterns.
71+
72+
Files to reference during development
73+
- Top-level: `README.md`, `CMakeLists.txt`, `CMakePresets.json`
74+
- Core: `core/CMakeLists.txt`, `core/include/traccc/edm/`, `core/include/traccc/seeding/`, `core/include/traccc/finding/`, `core/include/traccc/fitting/`
75+
- Device: `device/cuda/`, `device/sycl/`, `device/alpaka/` (look at `include/traccc/<backend>` and `src/` for wrappers)
76+
- Vendored deps and glue: `extern/`, `cmake/` helpers (e.g. `traccc-functions.cmake`)
77+
78+
If anything is unclear or you'd like more detail (examples of header/ipp
79+
patterns, a unit test scaffold, or preferred presets for CI), ask and I'll
80+
extend this file with short examples.

device/alpaka/src/finding/combinatorial_kalman_filter.hpp

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
/** TRACCC library, part of the ACTS project (R&D line)
22
*
3-
* (c) 2024-2025 CERN for the benefit of the ACTS project
3+
* (c) 2024-2026 CERN for the benefit of the ACTS project
44
*
55
* Mozilla Public License Version 2.0
66
*/
@@ -643,7 +643,7 @@ combinatorial_kalman_filter(
643643
.links_view = links_buffer,
644644
.tips_view = tips_buffer,
645645
.tracks_view = track_candidates_buffer,
646-
.tip_to_output_map = nullptr,
646+
.tip_to_output_map = {},
647647
.jacobian_ptr = jacobian_ptr.get(),
648648
.link_predicted_parameter_view =
649649
link_predicted_parameter_buffer,

device/common/CMakeLists.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -75,13 +75,15 @@ traccc_add_library( traccc_device_common device_common
7575
"include/traccc/finding/device/gather_measurement_votes.hpp"
7676
"include/traccc/finding/device/fill_finding_propagation_sort_keys.hpp"
7777
"include/traccc/finding/device/propagate_to_next_surface.hpp"
78+
"include/traccc/finding/device/update_tip_length_buffer.hpp"
7879
"include/traccc/finding/device/impl/apply_interaction.ipp"
7980
"include/traccc/finding/device/impl/build_tracks.ipp"
8081
"include/traccc/finding/device/impl/find_tracks.ipp"
8182
"include/traccc/finding/device/impl/gather_best_tips_per_measurement.ipp"
8283
"include/traccc/finding/device/impl/gather_measurement_votes.ipp"
8384
"include/traccc/finding/device/impl/fill_finding_propagation_sort_keys.ipp"
8485
"include/traccc/finding/device/impl/propagate_to_next_surface.ipp"
86+
"include/traccc/finding/device/impl/update_tip_length_buffer.ipp"
8587
# Track fitting funtions(s).
8688
"include/traccc/fitting/device/fit.hpp"
8789
"include/traccc/fitting/device/fill_fitting_sort_keys.hpp"

device/common/include/traccc/finding/device/build_tracks.hpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,7 @@ struct build_tracks_payload {
4848
/**
4949
* @brief Optional mapping from tip index to output index
5050
*/
51-
const unsigned int* tip_to_output_map = nullptr;
51+
vecmem::data::vector_view<const unsigned int> tip_to_output_map;
5252
bound_matrix<default_algebra>* jacobian_ptr = nullptr;
5353
bound_track_parameters_collection_types::view link_predicted_parameter_view;
5454
bound_track_parameters_collection_types::view link_filtered_parameter_view;

device/common/include/traccc/finding/device/impl/build_tracks.ipp

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -45,8 +45,10 @@ TRACCC_HOST_DEVICE inline void build_tracks(
4545
return;
4646
}
4747

48-
const unsigned int output_idx = payload.tip_to_output_map != nullptr
49-
? payload.tip_to_output_map[globalIndex]
48+
const vecmem::device_vector<const unsigned int> tip_to_output_map(
49+
payload.tip_to_output_map);
50+
const unsigned int output_idx = tip_to_output_map.size()
51+
? tip_to_output_map.at(globalIndex)
5052
: globalIndex;
5153

5254
if (output_idx == std::numeric_limits<unsigned int>::max()) {
Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
/** traccc library, part of the ACTS project (R&D line)
2+
*
3+
* (c) 2025-2026 CERN for the benefit of the ACTS project
4+
*
5+
* Mozilla Public License Version 2.0
6+
*/
7+
8+
#pragma once
9+
10+
// VecMem include(s).
11+
#include <vecmem/containers/device_vector.hpp>
12+
#include <vecmem/memory/device_atomic_ref.hpp>
13+
14+
// System include(s).
15+
#include <cassert>
16+
#include <limits>
17+
18+
namespace traccc::device {
19+
20+
TRACCC_HOST_DEVICE inline void update_tip_length_buffer(
21+
global_index_t thread_id, const update_tip_length_buffer_payload& payload) {
22+
23+
const vecmem::device_vector<const unsigned int> measurement_votes(
24+
payload.measurement_votes);
25+
if (thread_id >= measurement_votes.size()) {
26+
return;
27+
}
28+
29+
const vecmem::device_vector<const unsigned int> old_tip_length(
30+
payload.old_tip_length);
31+
vecmem::device_vector<unsigned int> new_tip_length(payload.new_tip_length);
32+
33+
const unsigned int total_measurements = old_tip_length.at(thread_id);
34+
const unsigned int total_votes = measurement_votes.at(thread_id);
35+
36+
assert(total_votes <= total_measurements);
37+
38+
const float vote_fraction = static_cast<float>(total_votes) /
39+
static_cast<float>(total_measurements);
40+
41+
vecmem::device_vector<unsigned int> tip_to_output_map(
42+
payload.tip_to_output_map);
43+
44+
if (vote_fraction < payload.min_measurement_voting_fraction) {
45+
tip_to_output_map.at(thread_id) =
46+
std::numeric_limits<unsigned int>::max();
47+
} else {
48+
const vecmem::device_vector<unsigned int>::size_type new_idx =
49+
new_tip_length.push_back(total_measurements);
50+
tip_to_output_map.at(thread_id) = new_idx;
51+
}
52+
}
53+
54+
} // namespace traccc::device
Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
/** traccc library, part of the ACTS project (R&D line)
2+
*
3+
* (c) 2025-2026 CERN for the benefit of the ACTS project
4+
*
5+
* Mozilla Public License Version 2.0
6+
*/
7+
8+
#pragma once
9+
10+
// Local include(s).
11+
#include "traccc/device/global_index.hpp"
12+
13+
// Project include(s).
14+
#include "traccc/definitions/qualifiers.hpp"
15+
16+
// VecMem include(s).
17+
#include <vecmem/containers/data/vector_view.hpp>
18+
19+
namespace traccc::device {
20+
21+
/// Payload for the @c device::update_tip_length_buffer function
22+
struct update_tip_length_buffer_payload {
23+
vecmem::data::vector_view<const unsigned int> old_tip_length;
24+
vecmem::data::vector_view<unsigned int> new_tip_length;
25+
vecmem::data::vector_view<const unsigned int> measurement_votes;
26+
vecmem::data::vector_view<unsigned int> tip_to_output_map;
27+
float min_measurement_voting_fraction;
28+
};
29+
30+
TRACCC_HOST_DEVICE inline void update_tip_length_buffer(
31+
global_index_t thread_id, const update_tip_length_buffer_payload& payload);
32+
33+
} // namespace traccc::device
34+
35+
// Include the implementation.
36+
#include "traccc/finding/device/impl/update_tip_length_buffer.ipp"

device/cuda/src/finding/combinatorial_kalman_filter.cuh

Lines changed: 16 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -569,7 +569,7 @@ combinatorial_kalman_filter(
569569
}
570570

571571
vecmem::vector<unsigned int> tips_length_host(mr.host);
572-
vecmem::unique_alloc_ptr<unsigned int[]> tip_to_output_map = nullptr;
572+
vecmem::data::vector_buffer<unsigned int> tip_to_output_map;
573573

574574
unsigned int n_tips_total_filtered = n_tips_total;
575575

@@ -640,40 +640,36 @@ combinatorial_kalman_filter(
640640
}
641641

642642
tip_to_output_map =
643-
vecmem::make_unique_alloc<unsigned int[]>(mr.main, n_tips_total);
643+
vecmem::data::vector_buffer<unsigned int>(n_tips_total, mr.main);
644+
copy.setup(tip_to_output_map)->wait();
644645

645646
{
646647
const unsigned int num_threads = 512;
647648
const unsigned int num_blocks =
648649
(n_tips_total + num_threads - 1) / num_threads;
649650

650651
vecmem::data::vector_buffer<unsigned int> new_tip_length_buffer{
651-
n_tips_total, mr.main};
652+
n_tips_total, mr.main, vecmem::data::buffer_type::resizable};
652653
copy.setup(new_tip_length_buffer)->wait();
653654

654-
auto tip_to_output_map_idx =
655-
vecmem::make_unique_alloc<unsigned int>(mr.main);
656-
657-
TRACCC_CUDA_ERROR_CHECK(cudaMemsetAsync(
658-
tip_to_output_map_idx.get(), 0, sizeof(unsigned int), stream));
659-
660655
kernels::update_tip_length_buffer<<<num_blocks, num_threads, 0,
661656
stream>>>(
662-
tip_length_buffer, new_tip_length_buffer, votes_per_tip_buffer,
663-
tip_to_output_map.get(), tip_to_output_map_idx.get(),
664-
config.min_measurement_voting_fraction);
657+
{tip_length_buffer, new_tip_length_buffer, votes_per_tip_buffer,
658+
tip_to_output_map, config.min_measurement_voting_fraction});
665659

666660
TRACCC_CUDA_ERROR_CHECK(cudaGetLastError());
667661

668-
str.synchronize();
669-
670-
TRACCC_CUDA_ERROR_CHECK(cudaMemcpyAsync(
671-
&n_tips_total_filtered, tip_to_output_map_idx.get(),
672-
sizeof(unsigned int), cudaMemcpyDeviceToHost, stream));
662+
if (mr.host) {
663+
vecmem::async_size size =
664+
copy.get_size(tip_to_output_map, *(mr.host));
665+
// Here we could give control back to the caller, once our code
666+
// allows for it. (coroutines...)
667+
n_tips_total_filtered = size.get();
668+
} else {
669+
n_tips_total_filtered = copy.get_size(tip_to_output_map);
670+
}
673671

674672
tip_length_buffer = std::move(new_tip_length_buffer);
675-
676-
str.synchronize();
677673
}
678674
}
679675

@@ -709,7 +705,7 @@ combinatorial_kalman_filter(
709705
.links_view = links_buffer,
710706
.tips_view = tips_buffer,
711707
.tracks_view = {track_candidates_buffer},
712-
.tip_to_output_map = tip_to_output_map.get(),
708+
.tip_to_output_map = tip_to_output_map,
713709
.jacobian_ptr = jacobian_ptr.get(),
714710
.link_predicted_parameter_view = link_predicted_parameter_buffer,
715711
.link_filtered_parameter_view = link_filtered_parameter_buffer,
Lines changed: 5 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -1,52 +1,20 @@
11
/** traccc library, part of the ACTS project (R&D line)
22
*
3-
* (c) 2025 CERN for the benefit of the ACTS project
3+
* (c) 2025-2026 CERN for the benefit of the ACTS project
44
*
55
* Mozilla Public License Version 2.0
66
*/
77

8+
// Local include(s).
9+
#include "../../utils/global_index.hpp"
810
#include "update_tip_length_buffer.cuh"
911

1012
namespace traccc::cuda::kernels {
1113

1214
__global__ void update_tip_length_buffer(
13-
const vecmem::data::vector_view<const unsigned int> old_tip_length_view,
14-
vecmem::data::vector_view<unsigned int> new_tip_length_view,
15-
const vecmem::data::vector_view<const unsigned int> measurement_votes_view,
16-
unsigned int* tip_to_output_map, unsigned int* tip_to_output_map_idx,
17-
float min_measurement_voting_fraction) {
18-
const unsigned int thread_idx = blockIdx.x * blockDim.x + threadIdx.x;
15+
const device::update_tip_length_buffer_payload payload) {
1916

20-
assert(tip_to_output_map != nullptr);
21-
assert(tip_to_output_map_idx != nullptr);
22-
23-
const vecmem::device_vector<const unsigned int> old_tip_length(
24-
old_tip_length_view);
25-
vecmem::device_vector<unsigned int> new_tip_length(new_tip_length_view);
26-
const vecmem::device_vector<const unsigned int> measurement_votes(
27-
measurement_votes_view);
28-
29-
if (thread_idx >= measurement_votes_view.size()) {
30-
return;
31-
}
32-
33-
const unsigned int total_measurements = old_tip_length.at(thread_idx);
34-
const unsigned int total_votes = measurement_votes.at(thread_idx);
35-
36-
assert(total_votes <= total_measurements);
37-
38-
const scalar vote_fraction = static_cast<scalar>(total_votes) /
39-
static_cast<scalar>(total_measurements);
40-
41-
if (vote_fraction < min_measurement_voting_fraction) {
42-
tip_to_output_map[thread_idx] =
43-
std::numeric_limits<unsigned int>::max();
44-
} else {
45-
const auto new_idx =
46-
vecmem::device_atomic_ref(*tip_to_output_map_idx).fetch_add(1u);
47-
new_tip_length.at(new_idx) = total_measurements;
48-
tip_to_output_map[thread_idx] = new_idx;
49-
}
17+
device::update_tip_length_buffer(details::global_index1(), payload);
5018
}
5119

5220
} // namespace traccc::cuda::kernels
Lines changed: 5 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,27 +1,18 @@
11
/** traccc library, part of the ACTS project (R&D line)
22
*
3-
* (c) 2025 CERN for the benefit of the ACTS project
3+
* (c) 2025-2026 CERN for the benefit of the ACTS project
44
*
55
* Mozilla Public License Version 2.0
66
*/
77

8-
#include <vecmem/containers/device_vector.hpp>
9-
#include <vecmem/containers/vector.hpp>
8+
#pragma once
109

11-
#include "traccc/definitions/common.hpp"
12-
#include "traccc/device/array_insertion_mutex.hpp"
13-
#include "traccc/edm/measurement_collection.hpp"
14-
#include "traccc/finding/candidate_link.hpp"
15-
#include "traccc/finding/finding_config.hpp"
16-
#include "traccc/utils/prob.hpp"
10+
// Project include(s).
11+
#include "traccc/finding/device/update_tip_length_buffer.hpp"
1712

1813
namespace traccc::cuda::kernels {
1914

2015
__global__ void update_tip_length_buffer(
21-
const vecmem::data::vector_view<const unsigned int> old_tip_length_view,
22-
vecmem::data::vector_view<unsigned int> new_tip_length_view,
23-
const vecmem::data::vector_view<const unsigned int> measurement_votes_view,
24-
unsigned int* tip_to_output_map, unsigned int* tip_to_output_map_idx,
25-
float min_measurement_voting_fraction);
16+
const device::update_tip_length_buffer_payload payload);
2617

2718
} // namespace traccc::cuda::kernels

0 commit comments

Comments
 (0)